Abstract

A quantum probability model is introduced and used to explain hu- 
man probability judgment errors including the conjunction, disjunction, 
inverse, and conditional fallacies, as well as unpacking effects and par- 
titioning effects. Quantum probability theory is a general and coherent 
theory based on a set of (von Neumann) axioms which relax some of 
the constraints underlying classic (Kolmogorov) probability theory. The 
quantum model is compared and contrasted with other competing expla- 
nations for these judgment errors including the representativeness heuris- 
tic, the averaging model, and a memory retrieval model for probability 
judgments. The quantum model also provides ways to extend Bayesian, 
fuzzy set, and fuzzy trace theories. We conclude that quantum infor- 
mation processing principles provide a viable and promising new way to 
understand human judgment and reasoning. 

Over 30 years ago, Kahneman and Tversky [1] began their influential pro- 
gram of research to discover the heuristics and biases that form the basis of 
human probability judgments. Since that time, many new and challenging
empirical phenomena have been discovered, including conjunction, disjunction,
conditional, inverse, and base rate fallacies [2]. Although heuristic
concepts (such as representativeness and availability) initially served as a guide 
to researchers in this area, there is a growing need to move beyond these in- 
tuitions, and develop more coherent, comprehensive, and deductive theoretical 
explanations [3]. The purpose of this article is to propose a new way of under- 
standing human probability judgment using quantum probability principles [4]. 
Quantum principles have been used recently in a number of psychological appli- 
cations including perception [5], conceptual structure [6], information retrieval 
[7], and human judgments [8].^1



^1 There is another independent line of research that uses quantum physical models of the
brain to understand consciousness [9] and human memory [10]. We are not following this line,
The concept of a probability judgment error requires a standard or norm,
and in the past, this norm was based on the Kolmogorov axioms for classic 
probability theory [11]. Classic theory is based on the assignment of probabilities 
to events defined as sets, and the Boolean logic entailed by using sets seems to 
be the source of the problems that occur with applications to human judgments. 
Quantum probability provides a more general geometric approach to probability 
theory that remains coherent but relaxes some of the constraints of Boolean 
logic [12]. Thus quantum probability provides an opportunity to explain what
appear to be judgmental 'errors' with respect to the classic definition, but at
the same time, it provides a quantum logical 'rationale' for human probability 
judgments. 

The remainder of this article is organized as follows. First we provide some
background information and review basic findings. Second, we provide a simple
and elementary introduction to quantum probability theory and apply these 
ideas to the basic findings. Finally, we summarize previous theoretical explana- 
tions, compare the advantages and disadvantages of the quantum model with 
the previous models, and indicate directions for future research. 

1 Background and Brief Review 

This article is mainly concerned with the explanation of conjunction and dis- 
junction fallacies, and so the following review and later theoretical analyses 
focus on these two basic issues. However, it is important to briefly examine
how well this explanation extends to some closely related phenomena, including
conditional and inverse fallacies and 'unpacking' effects. Therefore, although
we focus on conjunction and disjunction fallacies, we also briefly examine some
closely related fallacies. This review addresses the many qualitative (ordinal 
level) findings that have been discovered over the past 30 years. 

In many probability judgment studies, a story is provided which is followed 
by questions about the likelihood of events related to the story (e.g., a story 
about a liberal philosophy student from Berkeley named Linda is presented, 
and questions are asked about her future activities). Sometimes very little 
story is needed (e.g. a time and a place) and there is simply a causal connection 
between story events (e.g., an increase in cigarette tax is passed, and then a 
decrease in teenage smoking occurs). Some of the key experimental factors that 
are manipulated in these studies include the following. Questions about events 
can be related by referring to the same person (e.g., 'Linda is a bank teller',
'Linda is active in the feminist movement') or unrelated by referring to different
people ('Linda is active in the feminist movement', 'Bill is shy'). Questions about
events can have a high likelihood (e.g., 'Linda is active in the feminist movement') or
a low likelihood ('Linda is a bank teller'). Questions can be about events with
positive (e.g., 'Bill enjoys jogging and Bill plays soccer') or negative or zero
dependencies (e.g., 'Bill is an accountant and Bill likes jogging').

and instead we are using models at a more abstract level analogous to Bayesian models of 
cognition. 
Questions about generic events are labeled by letters such as A and B. We
use the letters H and L to denote questions about events that have a high or
low likelihood, respectively. Sometimes, subscripts on the letters will be used to
distinguish questions about events that are related or unrelated. For example,
$A_1$ and $B_1$ refer to events that are related (e.g., Tei has blue eyes, Tei has blond
hair); $A_1$ and $B_2$ refer to events that are unrelated (e.g., Tei has blue eyes,
Jerry has blond hair). When no subscripts appear, it can be assumed that the
events are related. The probabilities of interest include questions about a single
event (e.g., 'is A true?'), a negation of a question ('is not A true?', symbolized
as $\sim A$), a conjunctive question about events ('is A and B true?', symbolized
$A \wedge B$), a disjunctive question about events ('is A or B true?', symbolized
$A \vee B$), and a question about an implication ('if A is true, then is B true?',
symbolized as $A \mapsto B$). The symbols $\wedge$ and $\vee$ represent the classic Boolean logic
conjunction and disjunction relations, which are commutative, $(A \wedge B) \leftrightarrow (B \wedge A)$
and $(A \vee B) \leftrightarrow (B \vee A)$, and distributive, $A \wedge (B \vee \sim B) \leftrightarrow (A \wedge B) \vee (A \wedge \sim B)$.
The implication is not commutative: $(A \mapsto B) \nleftrightarrow (B \mapsto A)$. These logical
properties are intended by the experimenter asking the questions, but they may 
not necessarily be treated this way by human judges when answering questions 
about these events. Later, when various theoretical explanations for the findings 
are presented, different symbols are used for negation, conjunction, disjunction, 
and implication, because the formal properties of these logical relations differ 
across theories. 

Participants are asked to judge probabilities for questions about events, and 
these judgments are denoted by the letter J. The judged probabilities corresponding
to the single, negation, conjunction, disjunction, and implication questions
about events are denoted $J(A)$, $J(\sim A)$, $J(A \wedge B)$, $J(A \vee B)$, and $J(A \mapsto B)$.
These judgments may be obtained using a choice response (e.g., which event is
more likely), or rank ordering the likelihood of a list of events, or rating each
event (e.g., what are the chances out of 100 that an event is true), and sometimes
they are inferred from bets (e.g., decide which event you want to bet money on).
To evaluate whether or not a fallacy or judgment error occurs, one needs to 
compare the distribution of judgments across participants for one event with 
another. This is usually done using two methods: One is to compare the means 
(or medians) of the two distributions and determine whether the difference is 
statistically significant; the second is to compare the frequency of the correct 
versus incorrect orders and determine whether the frequencies are statistically 
different. These two methods usually but not always give the same answer when 
they are both reported. 
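The two comparison methods can be illustrated with a minimal sketch. The data below are hypothetical (they are not taken from any cited study), and the exact binomial sign test is only one reasonable way to compare the frequencies of correct and incorrect orders; the cited studies may have used other tests.

```python
from math import comb

def sign_test_p_value(n_incorrect: int, n_correct: int) -> float:
    """Two-sided exact binomial (sign) test of H0: both orders are equally likely."""
    n = n_incorrect + n_correct
    k = max(n_incorrect, n_correct)
    # Probability of a split at least as lopsided as the observed one, doubled.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def mean_difference(judgments_x, judgments_y):
    """Difference between the mean judgments given to two events."""
    return sum(judgments_x) / len(judgments_x) - sum(judgments_y) / len(judgments_y)

# Hypothetical paired judgments from six participants (illustrative only).
conj = [0.40, 0.35, 0.50, 0.30, 0.45, 0.20]    # J(H and L)
single = [0.25, 0.30, 0.20, 0.35, 0.30, 0.10]  # J(L)

# Method 1: compare the means of the two distributions.
diff = mean_difference(conj, single)

# Method 2: compare the frequency of incorrect orders (conjunction judged
# above its constituent) with the frequency of correct orders.
n_incorrect = sum(c > s for c, s in zip(conj, single))
n_correct = sum(c < s for c, s in zip(conj, single))
p_value = sign_test_p_value(n_incorrect, n_correct)

print(diff, n_incorrect, n_correct, p_value)
```

With so few hypothetical observations the two methods need not agree, which mirrors the remark that they usually, but not always, give the same answer.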

1.1 Basic Findings 

As mentioned earlier, this article is primarily concerned with conjunction and 
disjunction fallacies and some other closely related fallacies.^2 Figure 1 provides

^2 There is a large literature on inference that we plan to address in future work, but not at
this time. In particular, we do not address the large literature on the insensitivity to base-rates 
in Bayesian inference tasks [13]. 
a general overview of (a) the magnitude of conjunction errors [14] in the top
panel and (b) the magnitude of disjunction errors [15] in the bottom panel.

This figure plots the means of $J(H)$ and $J(L)$ along the X and Y axes, and the Z
(vertical) axis has the mean of $J(H \wedge L) - J(L)$ for the conjunction error and the
mean of $J(H) - J(H \vee L)$ for the disjunction error. The 36 open circles ($N = 40$
observations per circle) in the top panel are from Table 1 of Gavanski and
Roskos-Ewoldsen (1991), in which people judged the conjunction after the constituents;
the 24 solid dots ($N = 50$ observations per dot) in the top panel are from
Table 2 of Gavanski and Roskos-Ewoldsen (1991), in which people judged the
conjunction before the constituents; and the 18 circles ($N = 88$ observations
per circle) are from Experiment 2 of Fisk (2002), in which people judged the
disjunction and the constituents in a randomized order. Points that lie above
zero on the Z-axis indicate an error for the means. When the conjunction
was rated last, 5 large (greater than .10) conjunction errors occurred, and
they all occurred for $J(L) < .3$ and $J(H) > .8$; when the conjunction was
rated first, 7 large (greater than .10) conjunction errors occurred, and they all
occurred for $J(L) < .40$ and $J(H) > .70$; 8 large (greater than .10) disjunction
errors occurred, and all but two occurred for $J(L) < .30$ and $J(H) > .60$.^3 In
summary, large mean conjunctive and disjunctive errors tend to occur with a
high-low combination, they tend to disappear when $J(L)$ is approximately equal
to $J(H)$, and more errors occur when the conjunction is rated first as compared
to last. Next we consider how various other factors moderate these effects, and
we also review some other closely related probability judgment errors.

F1. Conjunctive fallacy: $J(H \wedge L) > J(L)$ [16]. This has been found comparing
means, medians, and frequencies. For example, when presented with the liberal
Linda story, 85% of 142 participants chose the event 'bank teller and feminist' as
more likely than 'bank teller' in a direct choice between these two events. This
high rate of conjunction errors persists even when both conjunctions, $(H \wedge L)$ as
well as $(H \wedge \sim L)$, are included in the list [17]. Other examples include a Norwegian
student story with J(blue eyes and blond hair) > J(blue eyes), a medical
example with J(age over 50 and heart attack) > J(heart attack), and a state tax
example with J(increase tax and reduce cigarette smoking) > J(reduce cigarette
smoking). These results occur using within and between subject designs; choice,
ranking, and rating response methods (Tversky & Kahneman, 1983) as well as
betting methods [18]; and even when participants are paid for being 'correct'
[19]. The findings occur with naive (undergraduates) and sophisticated (e.g.,
physicians) judges, but the effect is reduced for participants who have studied
statistics (Tversky & Kahneman, 1983).

F2. Disjunctive fallacy: $J(H \vee L) < J(H)$. This was found comparing
frequencies [20] and means [15]. For example, the Linda story produces J(feminist
or bank teller) < J(feminist).

F3. Both fallacies together: $J(H) > J(H \vee L) > J(H \wedge L) > J(L)$ [21].
Using the liberal Linda story, Morier and Borgida (1984) reported the following 

^3 The two exceptions were $J(L) = .19$, $J(H) = .42$, $J(H \vee L) = .29$ and $J(L) = .62$,
$J(H) = .85$, $J(H \vee L) = .72$.
Figure 1: Top panel shows the mean conjunction effect from 60 experimental
conditions, and bottom panel shows the mean disjunction effect from 18 experimental
conditions.



means: J(feminist) = .83 > J(feminist or bank teller) = .60 > J(feminist and
bank teller) = .36 > J(bank teller) = .26 ($N = 64$ observations per mean, and
the differences are statistically significant).
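As a small illustration, the reported means can be checked against the order constraints that set inclusion imposes on a conjunction and a disjunction. The sketch below uses only the four means quoted above; the function name `classic_violations` and the dictionary keys are labels chosen here for illustration.

```python
# Mean judgments for the Linda story, as quoted in the text above.
means = {
    "H": 0.83,        # J(feminist)
    "H or L": 0.60,   # J(feminist or bank teller)
    "H and L": 0.36,  # J(feminist and bank teller)
    "L": 0.26,        # J(bank teller)
}

def classic_violations(j):
    """Return the set-inclusion order constraints that the judgments violate."""
    constraints = {
        "J(H or L) >= J(H)": j["H or L"] >= j["H"],
        "J(H or L) >= J(L)": j["H or L"] >= j["L"],
        "J(H and L) <= J(H)": j["H and L"] <= j["H"],
        "J(H and L) <= J(L)": j["H and L"] <= j["L"],
    }
    return [name for name, satisfied in constraints.items() if not satisfied]

violations = classic_violations(means)
print(violations)
```

The two constraints that fail correspond to the disjunction fallacy (F2) and the conjunction fallacy (F1), respectively.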

F4. Containment error: $J(B) < J(A)$ where $B = (A \vee (\sim A \wedge B))$ [22]. This
was found comparing mean ranks and frequencies. For example, a photo of an 
Alpine scene produces J(photo is from Switzerland) > J(photo is from Europe) 
where of course Europe includes Switzerland and the rest of Europe other than 
Switzerland. 

F5. Unpacking effects. Implicit subadditivity refers to the order
$J(A) < J((A \wedge B) \vee (A \wedge \sim B))$. This was found comparing medians. For example, a story
about causes of death produces J(death by homicide) < J(death by homicide
from an acquaintance $\vee$ death by homicide from a stranger) [23]. The A event
is called the packed event, and the $(A \wedge B) \vee (A \wedge \sim B)$ event is the unpacked
event.^4 However, when the event A is unpacked into an unlikely event B and a
residual, then the opposite effect occurs, where $J(A) > J((A \wedge B) \vee (A \wedge \sim B))$
[25].

F6. Partitioning effect: The probability judgment given to an event A is
greater when the alternative is described as the negated event $\sim A$ (called the
case partition) as opposed to a partition equivalent to the negated event,
$(B_1 \vee B_2 \vee \cdots \vee B_n) \leftrightarrow \sim A$ (called the class partition) [26]. This was found comparing
medians. For example, people judge the probability of the event 'Sunday will be
hotter than any other day next week' (the case based partition) to be greater than
that of 'the hottest day next week will be Sunday' (the class based partition).

F7. Conditional fallacy: $J(H \mapsto L) < J(L \wedge H)$. For example, when given
a story about an overcast November day in Seattle, the following results were
obtained from 150 participants using medians: J(it rains) = .21 > J(it rains
and temperature remains below 38°F) = .18 > J(temperature remains below
38°F) = .14 > J(if it rains then the temperature remains below 38°F) = .12
[27]. Although the difference between the medians for the conjunction (.18)
and the implication (.12) is small, it was statistically significant. However, the
heart attack example produces J(if age is over 50 then heart attack) = .59 >
J(age over 50 and heart attack) = .30 > J(heart attack) = .18 (Tversky and
Kahneman, 1983), and no difference has been reported for the example
J(if increase tax then reduction in cigarette smoking) = J(increase in tax and
reduction in cigarette smoking) [28]. So this result seems to depend on the type
of problem.

F8. Inverse fallacy: $J(A \mapsto B) = J(B \mapsto A)$. This is found using
both means and frequencies. For example, J(if test is positive then disease
is present) = J(if disease is present then test is positive) [29]. This result
occurs with equally likely base rates for the disease, but unequal likelihoods for 
the test result given the disease, and so it is not explained by base rate neglect 

^4 This article focuses on implicit subadditivity/superadditivity, because it only requires an
ordinal comparison of two judgments. Explicit subadditivity/superadditivity [24] is based
on the comparison of one judgment with the sum of several other judgments, and the latter 
requires much stronger measurement assumptions. The quantum explanation for implicit 
unpacking also applies to explicit unpacking. 
[30]. Obviously, this finding also depends on the type of problem. For example,
if a person is murdered, then everyone would agree that the person is certainly 
dead; but no one would believe that if a person is dead, then the person was 
certainly murdered. 

F9. Averaging error: If $J(H) > J(M) > J(L)$, then $J(L) < J(L \wedge M)$ and
$J(H) > J(H \wedge M)$ [31]. This was found comparing means. For example, using a
boring but intellectual Bill story produces J(Bill plays in a rock band) < J(Bill
plays in a rock band and Bill is a park ranger), but J(Bill builds radio gliders) >
J(Bill builds radio gliders and Bill is a park ranger).

F10. Violation of independence: $J(A \wedge C) > J(B \wedge C)$ but $J(A \wedge D) <
J(B \wedge D)$ [32]. This was found comparing frequencies. For example, using the
story of a college applicant named Joe produces J(accepted at Harvard and 
accepted at Princeton) > J(rejected at Oklahoma and accepted at Princeton) 
but J(accepted at Harvard and rejected at Texas) < J(rejected at Oklahoma 
and rejected at Texas). 

F11. Effect of event dependencies. The presence of dependencies between
events A and B affects the rate of conjunction fallacies for $A \wedge B$ [15]. This was
found using means and frequencies. A positive conditional dependency increases 
the frequency of conjunction errors. 

F12. Effect of event likelihoods. (a) The highest frequency of conjunction errors
occurs with mixed $H \wedge L$ events, a lower frequency occurs with $H \wedge H$ events, and
the lowest occurs with $L \wedge L$ events [33]. However, while the mean magnitude
of the conjunction error is much larger with $H \wedge L$ events, no difference is
found between $L \wedge L$ and $H \wedge H$ events [14]. (b) The $H \wedge L$ items most often
produce only a single conjunction error with the L event; the $L \wedge L$ events most
often produce zero conjunction errors; and the $H \wedge H$ events produce both zero
and double conjunction errors about equally often [34]. But the rate of double
conjunction errors with $H \wedge H$ events is less than 50%, and they are not found
using means [14]. The mean estimates for the results reported by Gavanski
and Roskos-Ewoldsen (1991) were $J(A) = .28$, $J(B) = .19$, $J(A \wedge B) = .18$ for
the $L \wedge L$ condition; $J(A) = .77$, $J(B) = .23$, $J(A \wedge B) = .38$ for the $H \wedge L$
condition; and $J(A) = .76$, $J(B) = .69$, $J(A \wedge B) = .67$ for the $H \wedge H$ condition.
The same general pattern is observed with disjunction errors: they are most
common and largest in mean magnitude when one event has a low probability
and the other has a high probability [15]. The mean estimates for the results
reported by Fisk (2002) were $J(A) = .36$, $J(B) = .14$, $J(A \vee B) = .27$ for
the $L \vee L$ condition; $J(A) = .73$, $J(B) = .23$, $J(A \vee B) = .59$ for the $H \vee L$
condition; and $J(A) = .80$, $J(B) = .62$, $J(A \vee B) = .75$ for the $H \vee H$ condition.

F13. Effect of event relationship. Some researchers find (a) differences between
related and unrelated items (Tversky & Kahneman, 1983), but (b) others
find a smaller difference (Yates & Carlson, 1989) or no difference at all
[14]. An unrelated type of example is to present a boring Bill story and a liberal
Linda story, which produces J(Bill is an accountant and Linda is a bank
teller) > J(Linda is a bank teller) as well as J(Bill plays jazz and Linda is a 
feminist) > J(Bill plays jazz). This was found using means and frequencies. 

F14. Relation to typicality ratings. Conjunction errors correlate with
typicality rating conjunction effects [35]. The same is true for disjunction errors [22].

F15. Response mode and order effects. Conjunction errors are more prevalent
with ranking than ratings, but there is little or no difference between
probability and frequency ratings [17]. Apparently the early finding indicating that
frequency formats reduce conjunction errors confounded class inclusion
instructions with ratings versus ranking responses [36]. Conjunction errors are larger
in magnitude when the conjunction is rated first as opposed to being rated last 
[19]. This last result can be seen in Figure 1 comparing the circles with the 
solid dots. 

Facts 1-9 are considered 'errors' with respect to the classic (Kolmogorov)
probability theory. As pointed out by Tversky and Koehler (1994), these facts
seem contrary to other general approaches to judgments of uncertainty including 
the theory of belief functions [37] as well as fuzzy set theory [38]. 

2 Classic Probability Theory 

Before presenting quantum probability theory, it is worth reviewing the basic 
assumptions of classic probability theory. This way we can directly compare the 
key assumptions underlying the two theories and see exactly where they differ. 
A great attraction of classic probabilistic models of cognition is that they are 
coherent, that is, predictions are derived from a small set of axioms [11]. But 
these models incorporate an important hidden assumption that may be overly 
restrictive for describing human judgments. 

Classic theory provides a set theoretic approach to probabilities: events are 
represented as subsets from a universal set (called the sample space). We will 
assume that the cardinality of the sample space is n (a large but finite number). 
In other words, the sample space contains n sample points, or unique outcomes 
(called elements). For this application, we can think of each element, such as $E_j$, as
representing a unique pattern of feature values. The story provides information
that is used with prior knowledge to form a probability function, denoted $p$,
which assigns a probability to each element. The classic probability assigned to
a particular feature pattern is a positive real number denoted $p_j = p(E_j) > 0$,
and these probabilities must sum to one across all n elements in the universal
set. A single question about event A is represented by a subset, denoted $A'$,
of the universal event composed of $m < n$ elements. The event $A'$ contains
the subset of elements (feature patterns) that are true for the question about
event A. The classic probability of this event equals the sum of the elementary
probabilities in the subset: $p(A') = \sum_{E_j \in A'} p_j$. The negation of this event is
the set complement $\bar{A}'$, which has a probability $p(\bar{A}') = 1 - p(A')$.
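The set-theoretic definitions above can be sketched in a few lines. The four feature patterns and their probabilities below are hypothetical values chosen for illustration; `prob` is simply a name used here for the function that sums elementary probabilities over an event.

```python
from fractions import Fraction

# Hypothetical sample space: each element E_j is a pattern of two binary
# features. The probabilities are illustrative assumptions, not data.
p = {
    ("feminist", "teller"): Fraction(1, 20),
    ("feminist", "not teller"): Fraction(15, 20),
    ("not feminist", "teller"): Fraction(1, 20),
    ("not feminist", "not teller"): Fraction(3, 20),
}
assert sum(p.values()) == 1  # elementary probabilities sum to one

def prob(event):
    """Classic probability of an event: sum of p_j over its elements."""
    return sum((p[e] for e in event), Fraction(0))

sample_space = set(p)
A = {e for e in sample_space if e[1] == "teller"}  # the subset A'
not_A = sample_space - A                           # the set complement of A'

print(prob(A), prob(not_A))
```

Exact fractions are used so that the complement rule holds without floating-point rounding.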

Defining events as sets requires the events to satisfy a set closure property: 
If $A'$ and $B'$ are events from the sample space, then the union and intersection
of these two are also events from the sample space. This brings us to
representations for questions about pairs of events. A question about the conjunction
$(A \wedge B)$ is represented by the intersection of sets $(A' \cap B')$, and a question about
the disjunction $(A \vee B)$ is represented by the union of sets $(A' \cup B')$. However,
this requires making a crucial but hidden assumption called the compatibility
assumption. It is assumed that the event $B'$ used for question B is a subset of
the same sample space as the subset $A'$ used for question A. In other words,
different members from a common set of elementary events are used to define
$A'$ as well as $B'$. Psychologically, a common set of features is used to describe
both kinds of events. At first it may seem hard to imagine a situation where
compatibility fails, but later we argue that this key assumption should not be
taken for granted. Events defined as sets satisfy the commutative properties,
$(A' \cap B') = (B' \cap A')$ and $(A' \cup B') = (B' \cup A')$, as well as the distributive
property $A' \cap (B' \cup \bar{B}') = (A' \cap B') \cup (A' \cap \bar{B}')$ of Boolean logic.

Conditional probabilities are used to represent judgments about
implications [39]. Suppose event $A'$ is assumed to be true. If $A'$ is true, then a
new conditional probability function $p_A$ is formed to update the elementary
event probabilities as follows: If $E_j \in A'$ then $p_A(E_j) = p(E_j)/p(A')$, and
zero otherwise, so that the sum of the conditional probabilities equals one.
This new conditional probability function $p_A$ can then be used to determine
new probabilities for other events from the same sample space. Based on
this assumption, the conditional probability of event $B'$ given event $A'$ equals
$p(B'|A') = \left(\sum_{E_j \in A' \cap B'} p_j\right)/p(A') = p(A' \cap B')/p(A')$.

The probability of a positive response to the conjunction question requires
a yes to question A and a yes to question B, which equals the joint probability
$p(A' \cap B') = p(A') \cdot p(B'|A')$. A positive response to the disjunction question
requires a yes to $(A \wedge B)$ or $(A \wedge \sim B)$ or $(\sim A \wedge B)$. But a simpler way to answer
the disjunction is to make a negative response if the answer to question A is no
and the answer to question B is no, so that $p(A' \cup B') = 1 - p(\bar{A}' \cap \bar{B}')$. The
latter is particularly useful when more events are involved and so we will use
this form hereafter. The law of total probability, which is a key principle for
Bayesian modeling, follows from the distributive law of Boolean logic:

$$
\begin{aligned}
p(B') &= p(B' \cap (A' \cup \bar{A}')) = p((B' \cap A') \cup (B' \cap \bar{A}')) \\
      &= p(A' \cap B') + p(\bar{A}' \cap B') = p(A') \cdot p(B'|A') + p(\bar{A}') \cdot p(B'|\bar{A}').
\end{aligned}
\tag{1}
$$

The above probability rules imply the following orders: $1 \geq p(H' \cup L') \geq
p(H') \geq p(L') \geq p(H' \cap L') \geq 0$, and $p(A'|B') \geq p(A' \cap B') = p(B') \cdot p(A'|B')$.
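These rules can be verified numerically on a toy sample space. The sketch below is only an illustration under assumed values: the four feature patterns and their probabilities are hypothetical, and `prob` and `cond` are helper names introduced here.

```python
from fractions import Fraction

# Hypothetical elementary probabilities over four feature patterns.
p = {
    ("feminist", "teller"): Fraction(1, 20),
    ("feminist", "not teller"): Fraction(15, 20),
    ("not feminist", "teller"): Fraction(1, 20),
    ("not feminist", "not teller"): Fraction(3, 20),
}
space = set(p)

def prob(event):
    """Sum of elementary probabilities over the elements of an event."""
    return sum((p[e] for e in event), Fraction(0))

def cond(b, a):
    """Conditional probability p(B'|A') = p(A' n B') / p(A')."""
    return prob(a & b) / prob(a)

H = {e for e in space if e[0] == "feminist"}  # high likelihood event H'
L = {e for e in space if e[1] == "teller"}    # low likelihood event L'
not_H, not_L = space - H, space - L

# Conjunction as a joint probability.
joint = prob(H) * cond(L, H)
# Disjunction as one minus the probability of two negative answers.
disjunction = 1 - prob(not_H & not_L)
# Law of total probability, Equation (1).
total = prob(H) * cond(L, H) + prob(not_H) * cond(L, not_H)

print(joint, disjunction, total)
```

For these assumed values the joint probability is 1/20, the disjunction is 17/20, and the total probability recovers p(L') = 1/10, so the order constraints stated above hold.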

The prime notation $A'$ was introduced by Tversky and Koehler (1994) to
distinguish questions about an event A from the corresponding mathematical set
$A'$ implied by the description. This is needed because two different descriptions
could logically imply the same set, yet judgments may differ between the two
logically equivalent descriptions. For similar reasons, different symbols are used
to represent conjunctive and disjunctive questions ($\wedge$, $\vee$) and the corresponding
intersection and union relations ($\cap$, $\cup$) used in classic probability theory. This is
necessary because the logical relations implied by these symbols may obey different
observable properties. If we assume natural language conjunction $(A \wedge B)$
corresponds with intersection $(A' \cap B')$ and natural language disjunction $(A \vee B)$
corresponds with union $(A' \cup B')$, then facts 1-9 show that human judgments
do not follow classic probability theory. One way to retain a classic probability
theory of human judgment in view of these facts is to assume that such direct
and strict correspondences do not hold [40]. For example, one can assume that
the conjunction question is answered using a conditional probability of the story
given the event in question [41]. In other words, people misinterpret the questions
and judge the wrong probabilities. But this argument does not apply to
studies that use betting procedures, which implicitly require likelihoods to make
decisions, and never explicitly request a probability judgment. Another way to
retain classic probability theory is to assume that each single probability judgment
from an individual follows classic rules, but these judgments are based on
noisy sample estimates contaminated by error [42]. Noisy probability estimates
can produce highly frequent conjunction errors [43]. However, this cannot explain
violations of conjunction and disjunction rules when these violations occur
with means and medians, which cancel out the noise.

3 Quantum Probability Theory 

First we will briefly summarize the basic assumptions of quantum probability 
theory. This summary has to be abstract so that we can compare only the 
essential and basic assumptions directly with classic probability theory. Later we 
elaborate with simple graphical and numerical examples and provide important 
psychological intuitions behind these ideas.^5 Quantum theory is comparable
with classic probability theory in terms of its coherence: its predictions are
also derived from a small set of axioms [45]. But quantum axioms differ from 
classic axioms, and it is an empirical question whether one or the other provides 
a better representation of human judgment. 

Quantum theory provides a geometric approach to probabilities: events are 
represented by subspaces of a vector space (called the Hilbert space). We will 
assume that the dimensionality of the vector space is n (again a large but finite 
number). In other words, the vector space is based on n orthogonal and unit
length vectors (called eigenvectors). For this application, we can think of each
eigenvector, denoted $V_j$, as representing a unique pattern of feature values.
The story provides information that is used with prior knowledge to form a
state vector, denoted $\psi$, which assigns a scalar (called an amplitude) to each
eigenvector by the inner product $V_j^\dagger \cdot \psi = \psi_j$. The quantum probability of
a particular feature pattern equals the squared magnitude of its amplitude,
$q(V_j) = |\psi_j|^2$, and these probabilities must sum to one across all n eigenvectors
of the vector space (this is called Born's rule). A single question about event A is
represented by an m-dimensional subspace, denoted $A''$, within the vector space
($m < n$). The subspace $A''$ is spanned by a subset of the eigenvectors (feature
patterns) that are true for the question about the event A. The quantum 

^5 The reader only needs knowledge of linear algebra to understand this section. We realize
that some readers may need reminders and so we included a brief tutorial in the appendix.
No knowledge of physics is required. This application only uses the simplest and most basic
ideas of quantum theory. See Hughes for a good nonphysical introduction to quantum theory
[44].
probability for this event equals the sum of the squared magnitudes of the
amplitudes for the eigenvectors that span the subspace: $q(A'') = \sum_{V_j \in A''} |\psi_j|^2$.
The negation of this event is the $(n - m)$-dimensional subspace, denoted $A^\perp$, that
is orthogonal to the subspace $A''$, which has a probability $q(A^\perp) = 1 - q(A'')$.
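A minimal sketch of this rule follows, assuming a hypothetical three-dimensional space in which the state vector is written directly as a list of amplitudes on the eigenvectors. The amplitudes, the index sets, and the helper name `q` are illustrative choices.

```python
import math

# Hypothetical amplitudes psi_j on the eigenvectors V_1, V_2, V_3.
# One amplitude is complex to show that only its magnitude matters.
psi = [0.6, 0.0, 0.8j]
assert math.isclose(sum(abs(a) ** 2 for a in psi), 1.0)  # unit-length state

def q(event_indices, state):
    """Sum of squared amplitude magnitudes over the eigenvectors spanning the event."""
    return sum(abs(state[j]) ** 2 for j in event_indices)

A = {0, 1}                         # subspace spanned by V_1 and V_2
A_perp = set(range(len(psi))) - A  # orthogonal subspace, spanned by V_3

q_A = q(A, psi)
q_A_perp = q(A_perp, psi)
print(q_A, q_A_perp)
```

Because the eigenvectors are orthonormal and the state has unit length, the probabilities of the subspace and of its orthogonal complement sum to one.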

Defining events as subspaces implies that the events must satisfy a subspace
closure property: if vectors $V_j$ and $V_k$ are members of the subspace, then
$W = aV_j + bV_k$, for arbitrary scalars $a$, $b$, must also be a member. Consequently,
one full set of eigenvectors $\{V_j, j = 1, \ldots, n\}$ can be 'rotated' by a unitary
(orthonormal) matrix into another full set of eigenvectors $\{W_j, j = 1, \ldots, n\}$. Thus
there exists more than one set of eigenvectors that can be used to describe
events within the same vector space. This brings us again to representations of
questions about pairs of events. Suppose question A corresponds to subspace
$A''$ described by a subset of the $V_j$ eigenvectors; but suppose question B corresponds
to a subspace $B''$ that cannot be described by these same features,
and instead it requires a different subset of the $W_j$ eigenvectors. Then the pair
of events $A''$, $B''$ cannot be described by a common set of eigenvectors, which
makes these two events incompatible. Psychologically, different kinds of features
may be needed to describe the two different events. If event $A''$ can be defined
by the same set of $V_j$ eigenvectors as event $B''$, that is, they share a common
set of eigenvectors, then these two events are compatible. Quantum theory requires
a general representation of conjunction and disjunction that applies to
both compatible and incompatible events. This is achieved by using a sequential
logical operation to represent conjunction and disjunction questions [46].^6 Suppose
a question about event A is asked first followed by a question about event
B. These questions are answered in order and the requested logical operation is
performed on the answers. If asked about the conjunction in this order, then a
positive response to the conjunction requires a yes to A followed by a yes to B,
and this sequential logical and operation is denoted $(A'' \cap B'')$. If asked about
the disjunction in this order, then a negative response to the disjunction requires
a no to A followed by a no to B. A positive response to the logical disjunction
in this order is denoted $(A'' \cup B'')$. If the events are compatible, then the
commutative property holds, $(A'' \cap B'') = (B'' \cap A'')$ and $(A'' \cup B'') = (B'' \cup A'')$, and
so does the distributive property $A'' \cap (B'' \cup B^\perp) = (A'' \cap B'') \cup (A'' \cap B^\perp)$ (see
Appendix). But if the events are incompatible, then both of these properties
fail. Therefore, quantum events only obey a partial Boolean algebra [44].

Conditional quantum probabilities are used to represent judgments about
implications. Suppose event $A''$, which is defined in terms of the $V_j$
eigenvectors, is assumed to be true. Then a new conditional state vector is
formed as follows: if $V_j \in A''$, then the new amplitude assigned
to $V_j$ equals $\psi_j / \sqrt{q(A'')}$, and it equals zero otherwise, so that the sum of the
conditional probabilities equals one (von Neumann called this state reduction).
Now suppose we want to determine the probability of a new event $B''$, which is
defined by the $W_j$ eigenvectors. Then the probability of the new event $B''$
given $A''$ equals $q(B''|A'') = \sum_{W_j \in B''} |W_j^{\dagger} \cdot \psi_A|^2$, where $\psi_A$
denotes the conditional state vector (this is called Lüders' rule).

^ One might wonder if it makes sense to represent conjunction by the span of the intersection
of two sets of eigenvectors, and to represent disjunction by the span of the union of two
sets of eigenvectors. There are two major objections for incompatible events. First, according
to Bohr's principle of complementarity, incompatible events cannot be evaluated simultaneously,
and they must be examined sequentially. Second, this fails empirically to explain the
conjunction and disjunction fallacies.

The probability of a positive response to a conjunction equals the probability
of saying yes to the sequence of questions, $q(A'' \cap B'') = q(A'') \cdot q(B''|A'')$.
The probability of a negative response to a disjunction equals
$q(A^{\perp} \cap B^{\perp}) = q(A^{\perp}) \cdot q(B^{\perp}|A^{\perp})$, where $A^{\perp}$ and $B^{\perp}$ denote the
negations of the two events, and so the probability of a positive response to the disjunction
is $q(A'' \cup B'') = 1 - q(A^{\perp} \cap B^{\perp})$. If the events are compatible, then quantum
probability obeys the same laws as classic probability (see Appendix), but if the
events are incompatible they do not (see the examples below).
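
The sequential rules above can be checked numerically. The following is a minimal Python sketch under stated assumptions: the two-dimensional state `psi` and the rays `a` and `b` are hypothetical values chosen only for illustration (they do not come from the article), all amplitudes are real, and the helper names `project`, `prob`, and `condition` are ours.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def project(basis, v):
    """Project v onto the span of an orthonormal list of vectors."""
    out = [0.0] * len(v)
    for b in basis:
        c = dot(b, v)
        out = [o + c * bi for o, bi in zip(out, b)]
    return out

def prob(basis, state):
    """Squared length of the projection of state onto span(basis)."""
    p = project(basis, state)
    return dot(p, p)

def condition(basis, state):
    """Normalized projection: the state after a 'yes' to the event."""
    p = project(basis, state)
    n = math.sqrt(dot(p, p))
    return [x / n for x in p]

r = 1 / math.sqrt(2)
psi = [0.6, 0.8]                     # hypothetical unit-length state
a, a_perp = [1.0, 0.0], [0.0, 1.0]   # event A and its negation
b, b_perp = [r, r], [r, -r]          # event B (incompatible with A) and its negation

# Conjunction: yes to A, then yes to B.
q_a_and_b = prob([a], psi) * prob([b], condition([a], psi))
# Disjunction: one minus the probability of no to A, then no to B.
q_a_or_b = 1 - prob([a_perp], psi) * prob([b_perp], condition([a_perp], psi))

print(round(q_a_and_b, 4), round(q_a_or_b, 4))
```

With these assumed numbers the conjunction works out to .36 times .50 and the disjunction to one minus .64 times .50.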

In summary, the two probability theories share many similarities. Both
provide principles for defining probabilities for single events, complements, con- 
junctions, disjunctions, and implications (conditional probabilities). However, 
the key differences are that classic probability represents events as sets, which 
forces all the events to be compatible so that they satisfy the commutative and 
distributive properties of Boolean algebra; whereas quantum theory represents 
events as subspaces, which allows events to be either compatible or incompat- 
ible, and the latter can violate the commutative and distributive properties of 
Boolean algebra. But this has been presented in a very abstract manner to 
compare basic assumptions, and next we give a more intuitive presentation of 
quantum theory. 



3.1 Simple illustration of quantum principles 

Figure 2 provides a simple illustration of all the ideas using only a three-dimensional
vector space. (In general, we do not necessarily assume such a simple
space.) It is most convenient to use the matrix algebra of projectors to do
quantum calculations (using Matlab, R, Mathematica, etc.).

The set of three orthogonal axes labeled {X, Y, Z} represents three different
eigenvectors. For example, these three eigenvectors could represent three
mutually exclusive and exhaustive responses for a voter such as democrat,
republican, or independent (for our purposes, independent means not democrat
or republican):

$$X = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad Y = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad Z = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.$$


The quantum state $\psi$ is represented in this case by the vector associated with
the letter S in the figure (e.g., the state of an undecided voter just before
a presidential election). This state can be described in terms of coordinates
with respect to the {X, Y, Z} eigenvectors. In this case, the state assigns the
following amplitudes to the {X, Y, Z} eigenvectors:

$$\psi = S = (-.6963) \cdot X + (.6963) \cdot Y + (.1741) \cdot Z. \qquad (2)$$

Figure 2: Three dimensional vector space with two sets of incompatible questions

Dirac called vectors, such as Equation 2, superposition states with respect to
the eigenvectors {X, Y, Z}. At this point, a superposition state simply assigns 
probabilities to events generated by {X, Y, Z}, but later this concept takes on 
a deeper meaning. As can be seen in this example, the amplitudes assigned to 
each eigenvector can be positive or negative (or even complex numbers). But 
note that the length of the state vector S equals one. 

Quantum probabilities are expressed most simply and intuitively by using
the geometric concept of a projection (see Appendix). In this example, the state
vector S lies in a three dimensional space. The events of interest are subspaces
that have smaller dimension (rays or planes in this case). To determine the
probability of an event, we first project the state vector onto the subspace
that represents the event and then compute its squared length. A matrix is
used to perform this projection, which is called the projector for the event. All
the quantum events generated by the {X, Y, Z} eigenvectors are based on the
following three projectors (formed by outer products):

$$M_X = X \cdot X^{\dagger}, \quad M_Y = Y \cdot Y^{\dagger}, \quad M_Z = Z \cdot Z^{\dagger},$$

$$M_X + M_Y + M_Z = I \ \text{(identity)}.$$

For example, $M_X$ is a 3 x 3 projector matrix (it has a one in the first row and
column, and zeros everywhere else), and it projects the state S onto the ray
$X''$ containing the X eigenvector. The projection equals the matrix product of the
projector and the state, $M_X \cdot S$. The probability of event $X''$ (e.g., democrat)
equals the squared length of the projection, $|M_X \cdot S|^2 = |-.6963|^2 = .4848$.
The probability of event $Y''$ (e.g., republican) is computed in the same way,
$|M_Y \cdot S|^2 = |.6963|^2 = .4848$. Suppose the quantum event in question is the
plane formed by the span of eigenvectors {X, Y}, which is symbolized as $X'' + Y''$
(e.g., 'democrat or republican'). The projection of the state S onto the $X'' + Y''$
plane is the vector associated with the label A. The matrix $M_{X+Y} = M_X + M_Y$
projects the state vector S onto the subspace $X'' + Y''$:

$$A = M_{X+Y} \cdot S = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \cdot S = \begin{bmatrix} -.6963 \\ .6963 \\ 0 \end{bmatrix},$$

and the probability of the $X'' + Y''$ event equals $|A|^2 = |-.6963|^2 + |.6963|^2 = .9697$.
The negation of this disjunction event is the ray associated with the Z
eigenvector (e.g., independent), and the probability of this event is
$|M_{(X+Y)^{\perp}} \cdot \psi|^2 = |(I - M_{X+Y}) \cdot S|^2 = |M_Z \cdot S|^2 = |.1741|^2 = .0303 = 1 - .9697$.
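
The projection calculations above can be reproduced in a few lines. This is a minimal Python sketch rather than the authors' own code (the text mentions Matlab, R, or Mathematica); it assumes real amplitudes, uses the rounded coordinates of S from Equation 2, and the helper names `project` and `prob` are ours.

```python
X, Y, Z = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
S = [-0.6963, 0.6963, 0.1741]  # state vector from Equation 2 (rounded)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def project(basis, v):
    """Project v onto the span of an orthonormal list of vectors."""
    out = [0.0] * len(v)
    for b in basis:
        c = dot(b, v)
        out = [o + c * bi for o, bi in zip(out, b)]
    return out

def prob(basis, state):
    """Squared length of the projection of state onto span(basis)."""
    p = project(basis, state)
    return dot(p, p)

q_x = prob([X], S)        # democrat
q_y = prob([Y], S)        # republican
q_xy = prob([X, Y], S)    # democrat or republican (the X + Y plane)
q_z = prob([Z], S)        # independent, the negation of X + Y

print(round(q_x, 4), round(q_y, 4), round(q_xy, 4), round(q_z, 4))
```

Passing a list of orthonormal vectors to `project` is equivalent to multiplying by the sum of their outer-product projectors, which is why the plane X + Y is given as `[X, Y]`.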

If we were restricted to use only projections on the {X, Y, Z} eigenvectors, 
then quantum probabilities would obey the same laws as classic probabilities. 
However, a vector space has no privileged set of eigenvectors. We could rotate 
the first set {X, Y, Z} of eigenvectors to form a new orthonormal set of eigen- 
vectors labeled {U, V,W} in the figure. The unitary transformation matrix 
that generates coordinates for the new eigenvectors {U, V, W} is

$$T = [U \ V \ W] = \begin{bmatrix} 1/\sqrt{2} & 1/2 & -1/2 \\ 1/\sqrt{2} & -1/2 & 1/2 \\ 0 & 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}.$$


The first column of T gives the coordinates of the U = T · X eigenvector, which
is the ray that runs through the main diagonal of the $X'' + Y''$ plane. The
second and third columns of T give the V = T · Y and W = T · Z eigenvectors.
This new set of eigenvectors {U, V, W} represents a different perspective for
understanding features (e.g., moderate, liberal, conservative). In this (artificial)
example, the eigenvector U (e.g., moderate) lies in the $X'' + Y''$ plane and
happens to be midway between the eigenvectors X and Y (e.g., democrat,
republican). All the quantum events generated from the {U, V, W} set of
eigenvectors are based on the following three projectors (again formed by outer
products):

$$M_U = U \cdot U^{\dagger}, \quad M_V = V \cdot V^{\dagger}, \quad M_W = W \cdot W^{\dagger},$$

$$M_U + M_V + M_W = I.$$



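The state vector $\psi = S$ can also be described in terms of the amplitudes assigned
to the new eigenvectors {U, V, W}. The matrix product

$$\begin{bmatrix} U^{\dagger} \\ V^{\dagger} \\ W^{\dagger} \end{bmatrix} \cdot S = \begin{bmatrix} U^{\dagger} \cdot S \\ V^{\dagger} \cdot S \\ W^{\dagger} \cdot S \end{bmatrix} = \begin{bmatrix} 0 \\ -.5732 \\ .8194 \end{bmatrix}$$
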

transforms the amplitudes originally assigned to eigenvectors {X, Y, Z} into 
amplitudes assigned to eigenvectors {U, V, W}. This allows us to express the 
state S as a superposition state with respect to the eigenvectors {U, V, W}. In 
other words, the exact same state S can be expressed as a superposition of the 
{X, Y, Z} eigenvectors or as a superposition of the {U, V, W} eigenvectors: 

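$$\begin{aligned}
(-.6963) \cdot X + (.6963) \cdot Y + (.1741) \cdot Z &= S \qquad (3) \\
&= I \cdot S = (M_U + M_V + M_W) \cdot S \\
&= U \cdot U^{\dagger} \cdot S + V \cdot V^{\dagger} \cdot S + W \cdot W^{\dagger} \cdot S \\
&= (0) \cdot U + (-.5732) \cdot V + (.8194) \cdot W.
\end{aligned}$$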
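
The change of coordinates in Equation 3 can be verified directly. The sketch below is illustrative only: it assumes real amplitudes and the rounded coordinates of S, takes U, V, and W from the columns of T, and the variable names `amps` and `rebuilt` are ours.

```python
import math

r = 1 / math.sqrt(2)
U = [r, r, 0.0]        # first column of T (e.g., moderate)
V = [0.5, -0.5, r]     # second column of T (e.g., liberal)
W = [-0.5, 0.5, r]     # third column of T (e.g., conservative)
S = [-0.6963, 0.6963, 0.1741]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# Amplitudes of S with respect to {U, V, W}: inner products with each eigenvector.
amps = [dot(B, S) for B in (U, V, W)]

# Rebuilding S from those amplitudes shows it is the same state in a new basis.
rebuilt = [sum(a * B[i] for a, B in zip(amps, (U, V, W))) for i in range(3)]

print([round(a, 4) for a in amps])
```

The squared amplitudes give the probabilities of the moderate, liberal, and conservative events for the same state S.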

Now the concept of superposition becomes much deeper because the same state
S must generate probabilities for two different sets of eigenvectors. Shortly we
will show how superposition states with respect to different sets of eigenvectors
produce interference effects that are critical for explaining violations of classic
probability theory. But first, let us continue with a few more example calculations.
The probability of the event $W''$ (e.g., conservative) is determined by
projecting the state S onto the eigenvector W, using the projector $M_W$, which
produces the projection associated with the vector labeled B in the figure. The
squared length of this projection equals

$$\begin{aligned}
q(W'') = |B|^2 &= |M_W \cdot S|^2 \\
&= |W \cdot W^{\dagger} \cdot S|^2 \\
&= |W|^2 \cdot |W^{\dagger} \cdot S|^2 = 1 \cdot |W^{\dagger} \cdot S|^2 \\
&= |.8194|^2 = .6714,
\end{aligned}$$

which is simply the squared magnitude of the amplitude assigned to W in
Equation 3.

Finally consider the conditional probability of event $W''$ given that event
$X'' + Y''$ has occurred (e.g., the probability that the voter is conservative given
that the person voted for a democrat or republican). Following Lüders' rule,
we first compute the normalized projection of the original state S on the known
event $X'' + Y''$. Recall from above that A is the projection, and this is normalized
to form the vector $\psi_{X+Y} = A/|A| = [-1/\sqrt{2} \ \ 1/\sqrt{2} \ \ 0]^{\dagger}$. Then we
compute the squared length of the projection of the state $\psi_{X+Y}$ onto the ray
$W''$, which equals $q(W''|X''+Y'') = |M_W \cdot \psi_{X+Y}|^2 = \left| W \cdot W^{\dagger} \cdot (A/|A|) \right|^2 = .50$.

Similarly, the probability $q(W''|Z'') = |M_W \cdot Z|^2 = .50$.
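
The two conditional probabilities just computed follow the same two steps: normalize the projection onto the known event, then project onto the queried event. A minimal Python sketch of these steps is given below; it assumes real amplitudes and the rounded coordinates of S, and the function name `condition` is ours.

```python
import math

r = 1 / math.sqrt(2)
X, Y, Z = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
W = [-0.5, 0.5, r]
S = [-0.6963, 0.6963, 0.1741]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def project(basis, v):
    """Project v onto the span of an orthonormal list of vectors."""
    out = [0.0] * len(v)
    for b in basis:
        c = dot(b, v)
        out = [o + c * bi for o, bi in zip(out, b)]
    return out

def prob(basis, state):
    p = project(basis, state)
    return dot(p, p)

def condition(basis, state):
    """State reduction: the normalized projection onto the known event."""
    p = project(basis, state)
    n = math.sqrt(dot(p, p))
    return [x / n for x in p]

psi_xy = condition([X, Y], S)             # state after learning 'X or Y'
q_w_given_xy = prob([W], psi_xy)          # conservative given democrat or republican
q_w_given_z = prob([W], condition([Z], S))  # conservative given independent

print(round(q_w_given_xy, 4), round(q_w_given_z, 4))
```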

A cognitive processing interpretation of the basic quantum principles can
now be given by using the geometric concepts of states and projectors. The
eigenvectors correspond to feature patterns that are used to describe or characterize
events. The initial state vector $\psi$ represents the memory trace that
determines the potential for a pattern to be retrieved, which is formed by the
person's prior knowledge and the story told to the person. When questioned
about a single event A, a projector $M_A = \sum V_j \cdot V_j^{\dagger}$ is formed from the
features (eigenvectors) $\{V_j, j \in A''\}$ representing the question A. The projection,
$M_A \cdot \psi$, determines how well the retrieval cue provided by the question matches
the memory state, and the probability of a retrieval equals the squared length
of the projection: $q(A'') = |M_A \cdot \psi|^2 = \sum_{V_j \in A''} |\psi_j|^2$. If event $A''$ is assumed
to be true, then the initial state $\psi$ changes to a new state $\psi_A$, which is the
normalized projection $\psi_A = (M_A \cdot \psi)/|M_A \cdot \psi|$. After this information is given, if
a second question is asked about event B, then a projector $M_B = \sum W_j \cdot W_j^{\dagger}$ is
formed from the features (eigenvectors) $\{W_j, j \in B''\}$ representing the question
B, and the conditional probability of a positive response to the retrieval cue $B''$
after being given information about $A''$ equals $q(B''|A'') = |M_B \cdot \psi_A|^2$. Finally, the
probability of a positive response to the conjunction equals the probability of
positive retrievals to both the first and second questions

$$\begin{aligned}
q(A'' \cap B'') &= q(A'') \cdot q(B''|A'') \\
&= |M_A \cdot \psi|^2 \cdot |M_B \cdot \psi_A|^2 \\
&= |M_A \cdot \psi|^2 \cdot \left| M_B \cdot \frac{(M_A \cdot \psi)}{|M_A \cdot \psi|} \right|^2 \qquad (4) \\
&= |M_B \cdot M_A \cdot \psi|^2. \qquad (5)
\end{aligned}$$

3.2 Interference

The possibility of using different sets of eigenvectors, {X, Y, Z} versus {U, V, W},
within the same vector space to represent different types of questions
introduces an important psychological issue about the disturbance or interference
of one question by another. Suppose we ask a question about event
W (e.g., conservative). Before we ask this question, while the person is in the
initial state S, there is a .6714 probability of answering yes. After obtaining
this answer, the state changes (according to Lüders' rule) from state S to state
W. If we ask the same question again immediately, the person will answer yes
with certainty (and with frustration at being asked to repeat the answer). The
state W is no longer in superposition with respect to the {U, V, W} eigenvectors.
However, this same state W is in superposition with respect to the
{X, Y, Z} eigenvectors. In other words, once a person becomes certain about
the {U, V, W} set of eigenvectors, this person must become uncertain with respect
to the {X, Y, Z} set of eigenvectors. The person can't be certain about
both at the same time (Heisenberg called this the uncertainty principle). Furthermore,
if we now ask a question about event Y (e.g., republican), then the
probability of a yes equals .25 (given state W). If the answer to the Y question
happens to be yes, then the state changes (according to Lüders' rule again) from
W to Y, and the person becomes certain about the Y question, but becomes
uncertain again about the W question (probability of yes to question W given
state Y is also .25). In other words, asking the question about Y after the
answer to question W has changed the likelihood of responses about question
W from certainty to uncertainty again. (This is the reason filler items are inserted
in between repetitions of a question.) This type of disturbance between
questions can always happen with superposition states described by different
sets of eigenvectors within the same vector space.
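
The question sequence described in this section (W, then W again, then Y, then W) can be traced step by step. The sketch below is only an illustration under the same assumptions as the worked example: real amplitudes, the rounded coordinates of S, and helper names of our own choosing.

```python
import math

r = 1 / math.sqrt(2)
Y = [0.0, 1.0, 0.0]
W = [-0.5, 0.5, r]
S = [-0.6963, 0.6963, 0.1741]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def prob(ray, state):
    """Probability of 'yes' to a one-dimensional event (a ray)."""
    return dot(ray, state) ** 2

def condition(ray, state):
    """State after a 'yes': the normalized projection onto the ray."""
    c = dot(ray, state)
    return [c * x / abs(c) for x in ray]

q_w_first = prob(W, S)            # yes to W from the initial state
state_w = condition(W, S)         # state after answering yes to W
q_w_repeat = prob(W, state_w)     # asking W again immediately
q_y_after_w = prob(Y, state_w)    # asking Y from state W
state_y = condition(Y, state_w)   # state after answering yes to Y
q_w_after_y = prob(W, state_y)    # W is uncertain again

print(round(q_w_first, 4), round(q_w_repeat, 4),
      round(q_y_after_w, 4), round(q_w_after_y, 4))
```

The repeated W question is answered with certainty, while the intervening Y question returns the W question to a probability of .25.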

3.3 Compatibility of events 

According to quantum theory, order is usually critical, and one has to be careful
about the order in which questions are asked. For example, a projection on $X''$ followed
by a projection on $U''$ is not the same as these operations in reverse
($M_X \cdot M_U \neq M_U \cdot M_X$, i.e., the projection matrices do not commute). In other words, asking
question X first (e.g., whether a person is a democrat or not) followed by asking
question U (e.g., whether a person is a moderate or not) is not necessarily the
same as asking these questions in the opposite order. This order effect indicates
the property of incompatibility between these two events (they do not share the
same eigenvectors). Psychologically, one can only view one perspective at a 
time, the questions must be answered sequentially, and as we have seen, asking 
one question from one perspective can disturb a later question from a different 
perspective (for example, first asking about being a moderate can disturb a 
later question about being a democrat). There is an abundance of research 
demonstrating order effects on probability judgments [47]. For example, when
judging probabilities of guilt in a criminal trial, the direction of the effect of
weak evidence on judgments depends on whether it precedes or follows strong
evidence [48]. These order effects are inconsistent with classic probability theory,
and in the past, they have been explained in terms of anchoring-and-adjustment
types of adding or averaging models.

This capability of changing eigenvectors (i.e., changing perspectives) and
producing incompatible events makes quantum theory fundamentally different
from classic theory. Classic (Kolmogorov) probability theory assumes a single
compatible representation of events. Figure 2 was constructed assuming
that quantum events $X'', Y'', Z''$ (e.g., democrat, republican, independent) are
incompatible with the quantum events $U'', V'', W''$ (e.g., moderate, liberal,
conservative). In this case, one set of events is a rotation of the other set of events
within the same three dimensional space as depicted in Figure 2. To evaluate a 
question about the X event, we need to adopt the {X, Y, Z} eigenvector point 
of view; but then to evaluate a question about the U event, we need to rotate to 
the {U, V, W} eigenvector perspective. One cannot evaluate questions about 
X and U simultaneously (Bohr called this the principle of complementarity). 

Alternatively, note that whenever we ask questions using eigenvectors from
the same set, then the order does not matter. For example, $M_X \cdot M_Y = M_Y \cdot M_X$,
so $X''$ (e.g., is a person a democrat) is compatible with $Y''$ (e.g., is a person
a republican). This lack of order effect defines the property of compatibility
between these two events (they share the same eigenvectors). In this case one
can maintain the same perspective while answering both questions. In this way,
the questions can be answered simultaneously, because one question does not
disturb the other. In this case the two events $X'', Y''$ are also mutually exclusive
(i.e., orthogonal subspaces), and so are the two events $U'', V''$. In general, if
two events with projectors $M_A, M_B$ are orthogonal to each other, then they are compatible
because $M_A \cdot M_B = 0 = M_B \cdot M_A$. However, it is possible that two events
can be compatible yet not orthogonal. For example,
$M_{U+V} \cdot M_{V+W} = M_V = M_{V+W} \cdot M_{U+V}$, and so $U'' + V''$ (e.g., is a person a moderate or liberal) is
compatible with $V'' + W''$ (e.g., is a person liberal or conservative).
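
Compatibility can be tested mechanically by checking whether two projectors commute. The following Python sketch does this for the projectors of Figure 2; it is an illustration under stated assumptions (real amplitudes, plain nested lists as matrices), and the helper names `outer`, `matmul`, `madd`, and `close` are ours.

```python
import math

def outer(u, v):
    return [[x * y for y in v] for x in u]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def madd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

r = 1 / math.sqrt(2)
X, Y = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
U, V, W = [r, r, 0.0], [0.5, -0.5, r], [-0.5, 0.5, r]
M_X, M_Y = outer(X, X), outer(Y, Y)
M_U, M_V, M_W = outer(U, U), outer(V, V), outer(W, W)

# Events from different eigenvector sets: order matters.
x_u_commute = close(matmul(M_X, M_U), matmul(M_U, M_X))
# Events from the same eigenvector set: order does not matter.
x_y_commute = close(matmul(M_X, M_Y), matmul(M_Y, M_X))
# Compatible but not orthogonal: the planes U + V and V + W share the ray V.
M_UV, M_VW = madd(M_U, M_V), madd(M_V, M_W)
uv_vw_commute = close(matmul(M_UV, M_VW), matmul(M_VW, M_UV))
product_is_M_V = close(matmul(M_UV, M_VW), M_V)

print(x_u_commute, x_y_commute, uv_vw_commute, product_is_M_V)
```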

Now let us turn and examine the geometric situation used to represent events
when they are all compatible. Once again suppose that $\{X'', Y'', Z''\}$ represent
three mutually exclusive and exhaustive quantum events (e.g., a person is a
democrat, republican, independent); and suppose that $\{Q'', R'', S''\}$ is a different
set of three mutually exclusive and exhaustive quantum events (e.g., a person
is young, middle age, or old). As before, the events in $\{X'', Y'', Z''\}$ are not
necessarily orthogonal to the events in $\{Q'', R'', S''\}$, but now we assume that
the events in $\{X'', Y'', Z''\}$ are compatible with the events in $\{Q'', R'', S''\}$. This
implies not only that $M_X \cdot M_Y = M_Y \cdot M_X$ but also that $M_X \cdot M_Q = M_Q \cdot M_X$,
and this is true for all pairs of events. Now it is impossible to represent all
these events by Figure 2, because all of these compatible properties cannot occur
within a 3-dimensional space. These compatible events require (at least)
a 9-dimensional vector space (see Appendix) based on 9 orthonormal eigenvectors
{XQ, XR, XS, YQ, YR, YS, ZQ, ZR, ZS}, which forms a tensor product
space. In this 9-dimensional vector space, the single ray or eigenvector XQ
represents the pattern or joint event $X'' \cap Q''$ (e.g., democrat and young), and
the amplitude assigned to the XQ eigenvector determines the joint probability
of $X'' \cap Q''$. Using this representation, the event $X'' = XQ'' + XR'' + XS''$
(e.g., democrat) corresponds to the projector $M_X = (M_{XQ} + M_{XR} + M_{XS})$,
the event $Q'' = XQ'' + YQ'' + ZQ''$ (e.g., young) corresponds to the projector
$M_Q = (M_{XQ} + M_{YQ} + M_{ZQ})$, the intersection event $(X'' \cap Q'') = XQ''$ corresponds
to the projector $M_{X \cap Q} = M_X \cdot M_Q = M_Q \cdot M_X = M_{XQ} = XQ \cdot XQ^{\dagger}$,
and the span $X'' + Q''$ corresponds to $M_{X+Q} = M_X + M_Q - M_Q \cdot M_X$. If all of
the events are compatible, then the probabilities computed from quantum theory
obey the same laws as the probabilities computed from classic (Kolmogorov)
theory (see Appendix).
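
To see why compatible events behave classically, the 9-dimensional representation can be sketched with one amplitude per joint pattern. The amplitudes below are hypothetical (they are not taken from the article) and real-valued, and the names `keep_party`, `keep_age`, and `sq_len` are ours; the point is only that coordinate projections on a shared basis commute.

```python
import math

parties, ages = "XYZ", "QRS"
keys = [(p, a) for p in parties for a in ages]   # XQ, XR, ..., ZS

# Hypothetical amplitudes for the 9 joint patterns, normalized to unit length.
raw = {k: 1.0 + i for i, k in enumerate(keys)}
norm = math.sqrt(sum(v * v for v in raw.values()))
psi = {k: v / norm for k, v in raw.items()}

def project(state, keep):
    """Coordinate projection: zero every amplitude whose pattern fails keep()."""
    return {k: (v if keep(k) else 0.0) for k, v in state.items()}

def sq_len(state):
    return sum(v * v for v in state.values())

def keep_party(p):
    return lambda k: k[0] == p

def keep_age(a):
    return lambda k: k[1] == a

# Order does not matter for compatible events.
q_x_then_q = sq_len(project(project(psi, keep_party("X")), keep_age("Q")))
q_q_then_x = sq_len(project(project(psi, keep_age("Q")), keep_party("X")))

# The law of total probability holds: q(Q) is the sum over parties of q(party and Q).
q_q = sq_len(project(psi, keep_age("Q")))
total = sum(sq_len(project(project(psi, keep_party(p)), keep_age("Q")))
            for p in parties)

print(round(q_x_then_q, 4), round(q_q_then_x, 4), round(q_q, 4), round(total, 4))
```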

In summary, if events {X" , Y" , Z" } and events {U" , V" , W" } are incompat- 
ible, then the person can only respond with one of three possible outcomes at 
any point in time. The person can choose a response from the set {X" , Y" , Z"}, 
or the person can choose a response from the set $\{U'', V'', W''\}$, but we cannot
observe any combinations. So this situation can be represented within a
3-dimensional space. But when the events in $\{X'', Y'', Z''\}$ and $\{Q'', R'', S''\}$
are compatible, then a person can respond with a pair, one from each set, which
means one of 9 possible outcomes can occur. So we need to use at least a 9-dimensional
space to represent this situation. These are the smallest possible
dimensions that could be used for these examples, and in general, the dimen- 
sionality could be much larger in both cases. 

Of course it is possible to have a combination of compatible and incompat- 
ible events. For example, suppose we had three sets of questions: a first set of 
mutually exclusive and exhaustive events {X" ,Y" , Z"}, a second set of mutu- 
ally exclusive and exhaustive events {U" , V" , W"}, and a third set of mutually 
exclusive and exhaustive events {Q" , R" ,S"}. Again we suppose that a question 
taken from one set is not orthogonal to a question taken from a different set. 
In this situation it is possible, for example, to have the first and second set be 
incompatible with each other, but both could be compatible with the third set. 
This situation would require at least a 9-dimensional space. This vector space 
would be spanned by 9 eigenvectors formed from combinations of the first and 
third sets, or it would be spanned by 9 eigenvectors formed from combinations 
of the second and third sets; furthermore the two sets of eigenvectors would be 
related by a unitary transformation. 

When should events be treated as compatible or incompatible? The general 
answer is that this is an empirical question, and order effects are an empirical 
sign of incompatibility. However, at this point we make the working hypoth- 
esis that compatibility depends on experience with the combination of events. 
Conjunction errors disappear when individuals are given direct training expe- 
rience with pairs of events [49], and order effects on abductive inference also 
decrease with training experience [50]. On the one hand, if the person has a 
great deal of experience with the combination or pattern of events, then they 
have the opportunity to form a compatible vector space, and they can estimate 
the intersection of events from this large space of patterns of events. On the 
other hand, if an unusual or novel combination of events is presented, and the
person has little or no experience with such combinations, then they may not
have formed a compatible representation, and they must rely on incompatible 
representations of events that use the same small vector space but require taking 
different perspectives. A second way to facilitate the formation of a compat- 
ible representation is to present the required joint frequency information in a 
tabular format [51]. Instructions to use a joint frequency table format would 
encourage a person to form and make use of a compatible representation that 
assigns amplitudes to the cells of the joint frequency tables. 

3.4 Violations of commutative and distributive properties 

Quantum probabilities for sequential conjunctions violate the commutative property.
For example, referring to Figure 2, consider the quantum probability for
conjunctive questions about events X (e.g., democrat) and U (e.g., moderate)
again. The probability of agreeing to both when question X is queried first and
question U is asked second equals $q(X'' \cap U'') = |M_U M_X \cdot S|^2 = .2424$, and the
probability of yes to both in the opposite order is $q(U'' \cap X'') = |M_X M_U \cdot S|^2 = 0$.
This dramatic order effect occurs in this case for the following reason.
The initial state S for the individual shown in the figure is orthogonal to the
vector U. If this individual is initially asked about question U (e.g., are you a
moderate?), then there is zero probability of answering yes to this first question
(e.g., a person who likes to take a strong stand on issues), and so the conjunctive
probability is also zero. However, if the individual is initially asked about
the question X, then the initial state S is negatively correlated with the vector X
(e.g., democrat), and the squared magnitude of this amplitude yields a reasonable probability of
saying yes and transiting from the S to the X state; furthermore the X state
(e.g., democrat) is positively correlated with the U state (e.g., moderate), which
then makes it possible to transfer from X to U and answer yes to the second
question as well. In fact, it is well known that survey responses can be manipulated
by order [52], and similar 'chaining' effects are found in categorization
[53]. Quantum probabilities for disjunctions also violate the commutative property.
For example, consider once again Figure 2. The quantum probability for
the disjunction question $(X \vee U)$ assuming that question X is processed first
equals $q(X'' \cup U'') = 1 - |M_{U^{\perp}} M_{X^{\perp}} \cdot S|^2 = .7273$, and for the other order it is
$q(U'' \cup X'') = 1 - |M_{X^{\perp}} M_{U^{\perp}} \cdot S|^2 = .4848$. These differ because $q(U'') = 0$ for
the latter order.
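
The order effects reported above amount to applying the same two projections in different orders. The sketch below illustrates this with a hypothetical helper `seq_prob` (our name); it assumes real amplitudes and the rounded coordinates of S, so the results agree with the values in the text only up to rounding.

```python
import math

r = 1 / math.sqrt(2)
X, Y, Z = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
U, V, W = [r, r, 0.0], [0.5, -0.5, r], [-0.5, 0.5, r]
S = [-0.6963, 0.6963, 0.1741]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def project(basis, v):
    """Project v onto the span of an orthonormal list of vectors."""
    out = [0.0] * len(v)
    for b in basis:
        c = dot(b, v)
        out = [o + c * bi for o, bi in zip(out, b)]
    return out

def seq_prob(state, *subspaces):
    """Squared length after projecting onto each subspace in the order given."""
    v = state
    for basis in subspaces:
        v = project(basis, v)
    return dot(v, v)

# Conjunctions: yes to the first question, then yes to the second.
q_x_then_u = seq_prob(S, [X], [U])
q_u_then_x = seq_prob(S, [U], [X])

# Disjunctions: one minus 'no, then no'. The negation of X is span{Y, Z},
# and the negation of U is span{V, W}.
q_x_or_u = 1 - seq_prob(S, [Y, Z], [V, W])
q_u_or_x = 1 - seq_prob(S, [V, W], [Y, Z])

print(q_x_then_u, q_u_then_x, q_x_or_u, q_u_or_x)
```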

There is considerable direct evidence for order effects on the conjunctive
fallacy. In the first experiment of Gavanski and Roskos-Ewoldsen (1991), participants
rated the individual constituents before rating the conjunction (producing
the circles in Figure 1), and in the second experiment the conjunction
was rated first (producing the dots in Figure 1). As can be seen, rating the 
conjunction first produced a larger magnitude conjunction error. These results 
were replicated using random assignment to two groups within a single study 
by Stolarz-Fantino et al. (2003, Exp 2). When the conjunction came first, the 
mean probability rating for the conjunction equaled .26 as compared to a mean
of .18 for the low likelihood event, and 57% of the participants produced the error; but
for the opposite order the mean rating for the conjunction was .16 as compared 
to a mean of .14 for the low likelihood event, and only 31% of the participants 
produced the error. 

The law of total probability is fundamental to Bayesian theory, but according
to quantum theory, it fails whenever incompatible events are involved. To
see how and why this happens, we return to Figure 2. Consider the probability
for a question about event W (e.g., whether or not a person is a conservative).
According to classic probability theory, a positive response to this question can
happen in two mutually exclusive and exhaustive ways: the person is an independent
and a conservative $(Z' \cap W')$, or the person is not an independent and a
conservative $(\bar{Z}' \cap W')$. So the total probability that a person is a conservative
equals $p(W') = p((Z' \cup \bar{Z}') \cap W') = p(Z') \cdot p(W'|Z') + p(\bar{Z}') \cdot p(W'|\bar{Z}')$. Now
let us reconsider the quantum probabilities that we computed earlier for these
events using Figure 2. When we first asked a question about Z and then asked
about W, recall that we found $q(Z'') = .0303$ and $q(W''|Z^{\perp}) = q(W''|Z'') = .50$,
and so the total probability is
$q((Z'' \cap W'') \cup (Z^{\perp} \cap W'')) = q(Z'') \cdot q(W''|Z'') + q(Z^{\perp}) \cdot q(W''|Z^{\perp}) = .50$.
But if we directly ask a person a question about event
W, then we found earlier that $q(W'') = q((Z'' \cup Z^{\perp}) \cap W'') = .6714$, which
violates the law of total probability! This happens because
the initial state S is very similar to the ray $W''$, but the initial state S is very
dissimilar to the ray $Z''$, which must be reached first by one of the two indirect
routes from S passing through $Z''$ or $Z^{\perp}$ to $W''$. Violations of the law of total
probability have in fact been reported in some earlier research [54]. This
violation of the law of total probability by quantum theory will turn out to be
one of the key ideas to explain the fallacies reviewed earlier. This only happens
when events are incompatible.
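
The gap between the two-route total and the direct probability can be computed explicitly. This is a minimal Python sketch, assuming real amplitudes and the rounded coordinates of S; the names `via_z`, `via_not_z`, and `seq_prob` are ours.

```python
import math

r = 1 / math.sqrt(2)
X, Y, Z = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
W = [-0.5, 0.5, r]
S = [-0.6963, 0.6963, 0.1741]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def project(basis, v):
    """Project v onto the span of an orthonormal list of vectors."""
    out = [0.0] * len(v)
    for b in basis:
        c = dot(b, v)
        out = [o + c * bi for o, bi in zip(out, b)]
    return out

def seq_prob(state, *subspaces):
    """Squared length after projecting onto each subspace in the order given."""
    v = state
    for basis in subspaces:
        v = project(basis, v)
    return dot(v, v)

via_z = seq_prob(S, [Z], [W])          # q(Z) * q(W | Z)
via_not_z = seq_prob(S, [X, Y], [W])   # q(not Z) * q(W | not Z)
total = via_z + via_not_z              # what the law of total probability would give
direct = seq_prob(S, [W])              # asking about W directly

print(round(total, 4), round(direct, 4), round(direct - total, 4))
```

The difference `direct - total` is the interference contribution that classic probability theory has no place for.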

What determines the order for incompatible questions? This is an important 
empirical issue. A working hypothesis is that when the individual events differ 
greatly in terms of their likelihoods (e.g. for the Linda story, the event feminist 
is very likely whereas the event bank teller is very unlikely), then people start 
with the higher probability event. For the conjunction question $(H \wedge L)$ this
implies using the $(H'' \cap L'')$ conjunctive sequence. For example, when asked the
conjunction question regarding the Linda story, we assume that the feminist
event is processed before the bank teller event. But for the disjunction question
$(H \vee L)$, the relevant conjunction question that needs to be considered is
$(\sim H \wedge \sim L)$, and $\sim L$ is more likely than $\sim H$. So the 'start with the higher probability'
principle implies using the conjunctive sequence $(L^{\perp} \cap H^{\perp})$, which implies using
the disjunctive sequence $(L'' \cup H'')$. For example, when asked the disjunction
question regarding the Linda story, we assume that the not-bank teller event is
processed before the not-feminist event. Another factor that determines order
of processing is a cause-effect relation, i.e., if C is the cause and E is the
effect, then we assume $(C'' \cap E'')$. For example, when given the 'increase tax
and reduce smoking problem', we assume that the 'tax' cause is processed first.

4 Quantum Explanation of Judgment 'Errors'



The quantum model is essentially a similarity based approach to probability, 
where similarity is determined by inner products of vectors in a multidimensional
space. Thus it is quite consistent with the finding that typicality rating
conjunction effects are highly correlated with conjunction errors (Fact 14). In
fact, it has already proved to be highly successful for modeling typicality ratings
for conjunctive and disjunctive concepts [6]. But how do conjunction and dis- 
junction errors arise in the first place? We now turn to these more challenging 
questions. 

4.1 Conjunction error and its moderators 

Let us first consider a single conjunction fallacy (Fact 1). The state vector $\psi$
represents the memory state of the individual after reading the story (which
is based on prior knowledge together with details about the story). The
projector $M_H$ serves as a retrieval cue for retrieving features related to the
question about event H (feminist); and similarly, the projector $M_L$ serves as
the retrieval cue for questions about event L (bank teller). Thus $M_H$ projects
the Linda state $\psi$ onto the high likelihood image of feminist, and $M_L$ projects
the Linda state $\psi$ onto the low likelihood image of bank teller. According to
the 'start with the higher probability' rule, the probability for the sequential
conjunction is $q(H'' \cap L'') = |M_L \cdot M_H \cdot \psi|^2$, and the probability for the single
event is $q(L'') = |M_L \psi|^2$. So how can we (the theorists) tell whether or not the
fallacy occurs? To do this, we (the theorists, not the judge) need to express
the single event probability in terms of the conjunction probabilities using the
quantum rules (see Appendix for details):

$$\begin{aligned}
q(L'') &= |M_L \psi|^2 = |M_L \cdot I \cdot \psi|^2 \qquad (6) \\
&= |M_L \cdot (M_H + M_{H^{\perp}}) \cdot \psi|^2 \\
&= q(H'' \cap L'') + q(H^{\perp} \cap L'') + Int_L,
\end{aligned}$$

$$Int_L = 2 \cdot \mathrm{Re}[(M_L M_{H^{\perp}} \psi)^{\dagger} (M_L M_H \psi)]. \qquad (7)$$

Notice that the quantum probability (Equation 6) almost looks like the law of 
total probability (Equation 1), except for the interference term, $Int_L$ (associated
with event $L''$), which can be positive, negative, or zero. This interference is
the same mathematical concept that is used to explain the classic two hole 
experiment with photons in physics [55]. If the interference term is zero, then 
quantum probabilities satisfy the law of total probability and no conjunction 
error occurs. Thus the model allows some people to be consistent with classic 
probability theory. In particular, if Mh and Ml are compatible, then this 
interference term is exactly zero (see Appendix). Thus interference only occurs 
with incompatible events, and this explains why conjunction errors are robust 
for questions about unrelated events $H_1$ and $L_2$ concerning different people (Fact
13b). For this is exactly a situation in which it is unlikely that a person has
sufficient experience to form a compatible representation, and must represent
the situation with incompatible events that interfere. 
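The decomposition in Equations 6 and 7 can be checked numerically. The following is a minimal sketch under simplifying assumptions, not the authors' implementation: it assumes a real two dimensional space in which each question is a single ray, and the angles used for H and L, as well as all helper names, are hypothetical choices made only for illustration.

```python
import math

def ray_projector(theta):
    """Projector onto the ray at angle theta (radians) in a real 2-D space."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]

def complement(M):
    """Projector onto the orthogonal complement, I - M."""
    return [[1 - M[0][0], -M[0][1]], [-M[1][0], 1 - M[1][1]]]

def apply(M, v):
    """Matrix-vector product for a 2x2 matrix and a 2-vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def dot(u, v):
    """Real inner product of two 2-vectors."""
    return u[0] * v[0] + u[1] * v[1]

def decompose(psi, M_H, M_L):
    """Return q(L), q(H then L), q(not-H then L), and the interference term."""
    direct = apply(M_L, psi)                         # M_L psi
    via_h = apply(M_L, apply(M_H, psi))              # M_L M_H psi
    via_not_h = apply(M_L, apply(complement(M_H), psi))  # M_L M_Hperp psi
    q_l = dot(direct, direct)
    q_hl = dot(via_h, via_h)
    q_nhl = dot(via_not_h, via_not_h)
    interference = 2 * dot(via_not_h, via_h)         # 2 Re[...] for real vectors
    return q_l, q_hl, q_nhl, interference

# Hypothetical configuration: the story state lies close to H and far from L.
psi = [1.0, 0.0]
M_H = ray_projector(math.radians(30))
M_L = ray_projector(math.radians(85))
q_l, q_hl, q_nhl, interference = decompose(psi, M_H, M_L)
print(q_l, q_hl, q_nhl, interference)
```

With these assumed angles the sequential conjunction exceeds the single event probability, and the interference term is negative. Passing the same projector for both questions (a trivially compatible pair) drives the interference to zero.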

To produce the conjunction 'error' we require $Int_L < -q(H^{\perp} \cap L'') < 0$,
and because $q(H^{\perp} \cap L'') > 0$, this implies that the interference must
be sufficiently negative to produce a conjunction error. This last result
explains the fact that conjunction errors occur more frequently with questions
about mixed H and L events (Fact 12): $q(H^{\perp} \cap L'')$ must be small to
produce the conjunction fallacy. Note that if $H''$ is a question about a high
likelihood event, then $H^{\perp}$ is a low likelihood quantum event, and $L''$
is also a low likelihood quantum event, which makes $q(H^{\perp} \cap L'')$
small, and so only a small negative amount of interference is needed. This does
not happen for the low-low case (because $q(L_1^{\perp} \cap L_2'')$ has one
high component), or the high-high case (because $q(H_1^{\perp} \cap H_2'')$ has
one high component), and so the interference may be insufficient to produce the
conjunction error in these cases. In fact, the size of the conjunction error is
bounded by the difference between $q(H'') \geq q(H'' \cap L'') \geq q(L'')$, and
it shrinks to zero if $q(H'') = q(L'')$ (see Appendix). This in fact matches the
results shown in Figure 1. There it can be seen that the conjunction error is
present only for mixed H and L events on the left wall, and it is absent for
events on the diagonal of the X-Y plane, where $q(A)$ is almost equal to $q(B)$.
Furthermore, consistent with Fact 12, only single conjunction errors are
predicted to occur in the high-low case, because the interference effect is only
produced for the $L''$ quantum event when sequentially processed in the
$(H'' \cap L'')$ order (see a later section for the double conjunction error
issue).

How do we psychologically interpret this interference effect? Consider, for
example, Figure 2 once again. Suppose we compare the conjunction probabilities
$q(X'' \cap U'')$ and $q(X^{\perp} \cap U'')$ with the probability of the single
event $q(U'')$ given state S in the figure. These calculations produce the
following answers: $q(X'' \cap U'') = |M_U M_X \cdot S|^2 = .2424$ and
$q(X^{\perp} \cap U'') = |M_U (M_Y + M_Z) \cdot S|^2 = .2424$ but
$q(U'') = |M_U \cdot S|^2 = 0$ and so $Int_U = -.4848$. The first term,
$q(X'' \cap U'')$, is positive because S is negatively correlated with X, and X
is positively correlated with U in the figure, and so the squared magnitude is
positive. The second term is positive for the same kind of reasoning. But S is
orthogonal to U in the figure. The psychological intuition behind this math is
the following: while it is possible to reach the conclusion U by way of thinking
first about X from state S, it is impossible to reach this conclusion directly
from state S. In other words, the indirect line of thought $S \to X \to U$ has
a reasonable possibility even though there is no chance from the direct route
$S \to U$. You cannot see the conclusion U directly from state S; but the
indirect route (produced by asking about question X first) puts you in a state
that makes you think of something different, which then opens the possibility of
reaching a conclusion favoring yes to question U. For the Linda story, the judge
cannot directly imagine Linda as a bank teller; but if the judge first thinks
about her as a feminist, and then imagines her as a bank teller from this new
feminist point of view, it now seems more possible that she could be a bank
teller. This quantum explanation relates to both the availability and
representativeness heuristics. The representativeness heuristic comes into play
when matching the story to each question in terms of similarity, and the
availability heuristic comes into play when one question acts as a retrieval cue
redirecting thinking toward a different point of view.

The interference term can also be expressed as
$Int_L = q(L'') - [q(H'' \cap L'') + q(H^{\perp} \cap L'')]$. The first term,
$q(L'')$, is the probability of reaching a conclusion from a direct route
(initial state to conclusion). The bracketed term is the probability of reaching
the same conclusion summed across all indirect routes (through an incompatible
set of eigenstates) to that conclusion. Thus $Int_L$ is a quantity that we (the
theorists) derive to express the difference in probabilities caused by traveling
the direct route versus traveling a set of indirect routes, and different
interference terms can be derived depending on which set of indirect routes we
compare to the direct route. When the interference term is negative, the
indirect routes have a greater chance of reaching the conclusion; and when the
interference term is positive, the direct route has a greater chance of reaching
the conclusion. The interference term can be directly estimated from experiments
that request all three judgments $J(A)$, $J(A \wedge B)$, $J(A \wedge {\sim}B)$.^7
This procedure was carried out in the study by Wedell and Moro (2008), and using
the data reported in Table 2 from that article, we obtain the following
interference estimates: $-.36$ for the dice problem, and $-.55$ for the urn
problem. The calculation of the interference effect, $-.4848$, based on Figure 2
is an example of a conjunction 'fallacy' produced simply by using the inner
products (similarities) between vectors in the figure, and more exact results
can be obtained by adjusting these inner products. The inner products between
vectors are the key parameters for making exact predictions, and the model could
be fit to judgments using some type of multidimensional scaling algorithm. (This
would also require a more sophisticated response model.)
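The direct-versus-indirect comparison described above reduces to simple arithmetic once the three judgments are available. The sketch below is only illustrative: the function name is hypothetical, the ratings are made-up values rather than data from any study, and it treats raw judgments as probabilities, which is exactly the strong measurement assumption the text cautions about.

```python
def interference_estimate(j_a, j_a_and_b, j_a_and_not_b):
    """Direct-route judgment minus the sum of the two indirect-route judgments."""
    return j_a - (j_a_and_b + j_a_and_not_b)

# Hypothetical ratings on a 0-1 scale, chosen only to illustrate the sign.
estimate = interference_estimate(j_a=0.20, j_a_and_b=0.30, j_a_and_not_b=0.25)
favored_route = "indirect" if estimate < 0 else "direct"
print(estimate, favored_route)
```

A negative estimate indicates that the indirect routes together were judged more likely than the direct route.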

Note that $\mathrm{Re}[(M_L M_{H^{\perp}} \psi)^{\dagger} \cdot (M_L M_H \psi)]$
is the real part of the inner product between two vectors,
$M_L M_{H^{\perp}} \psi$ and $M_L M_H \psi$. The first vector,
$M_L M_{H^{\perp}} \psi$, is the projection of the state $\psi$ (produced by the
story) first on the $H^{\perp}$ subspace and then on to the $L''$ subspace; the
second vector is the projection of the same state $\psi$ (created by the story)
now on the $H''$ subspace and then again on the $L''$ subspace. For the Linda
story, $M_B M_{F^{\perp}} \psi$ captures the features that match the Linda story
with a type of person who is first considered not to be a feminist and then
considered also to be a bank teller; $M_B M_F \psi$ captures the features that
match the Linda story with a type of person who is first considered to be a
feminist and then again considered also to be a bank teller. Recall that an
inner product is like a correlation: if these two vectors match or are similar,
then the inner product will be positive; but if these two vectors mismatch or
are dissimilar, then the inner product will be negative; and if the two vectors
are unrelated or orthogonal, then the inner product will be zero. Although not
many features match between the Linda story and a person who is not a feminist
and is a bank teller, those that do match are likely to have some negative
relation to those that match a person who is a feminist and a bank teller,
resulting in a negative inner product and producing negative interference. More
generally, the relation between the features of the quantum events $H''$ and
$L''$, as well as their match to the story, are important for determining the
size and direction of interference. This is important for explaining Facts 10
and 11. The interference depends on the inner product of projections on event
subspaces, and this inner product provides a principled way to understand the
effects of semantics and interdependence of events on conjunction errors. This
inner product also allows for effects of relationship between events that are
sometimes found (but not necessary) for conjunction errors (Fact 13a).

^7 This method requires strong measurement assumptions for the judgment response.

A similar analysis applies to the studies of the conjunction fallacy that employ
cause-effect types of events. For example, suppose the two quantum events are
$C''$ (e.g., 'increase cigarette tax') and $E''$ (e.g., 'reduce smoking'). The
'cause first' order principle specifies that the prediction for the conjunction
is $q(C'' \cap E'')$, and the prediction for the single event $E''$ is

$$
\begin{aligned}
q(E'') &= |M_E \psi|^2 = |M_E \cdot I \cdot \psi|^2 = |M_E \cdot (M_C + M_{C^{\perp}}) \cdot \psi|^2 \\
       &= q(C'' \cap E'') + q(C^{\perp} \cap E'') + Int_E.
\end{aligned}
\tag{8}
$$

With negative interference produced by the sequential conjunctive judgment,
$Int_E < -q(C^{\perp} \cap E'')$, the quantum probabilities again produce a
conjunction fallacy $q(C'' \cap E'') > q(E'')$. Again the psychological
intuition is the following. From the initial state, it is hard to imagine why
teenage smoking should decrease; but it is not hard to imagine a tax increase on
cigarettes, and once you imagine that, it is not hard to imagine a drop in
teenage smoking. If there is a strong causal relation, then
$q(C'' \cap E'') = q(C'') \cdot q(E''|C'')$ is large (because $q(E''|C'')$ is
large) and $q(C^{\perp} \cap E'') = q(C^{\perp}) \cdot q(E''|C^{\perp})$ is
small (because $q(E''|C^{\perp})$ is small), and the conjunction fallacy is more
likely to occur. A positive conditional dependency between the cause and effect
increases the joint probability $q(C'' \cap E'')$ and decreases the joint
probability $q(C^{\perp} \cap E'')$, which agrees with Fact 11. The interference
in this case equals
$Int_E = 2 \cdot \mathrm{Re}[(M_E M_{C^{\perp}} \psi)^{\dagger} \cdot (M_E M_C \psi)]$.
This means that the inner product must be negative between (a) the projection
first on the cause absent followed by the effect, and (b) the projection first
on the cause present followed by the effect. In other words, the features
produced by situations associated with the cause absent and effect present are
negatively correlated with the features associated with the cause present and
the effect present.

It is time to address the issue of double conjunction errors. Double conjunction
errors occur more frequently for conjunctions that contain two highly likely
constituents. However, as can be seen in Figure 1, double conjunction errors are
not found using means even for $H \wedge H$ events. Quantum theory can only
produce zero or single conjunction errors. If $q(A'') > q(B'')$, then a single
conjunction error, $q(A'' \cap B'') > q(B'')$, is possible (see Appendix).
Double conjunction errors obtained from a single rank ordering of a list of
events can be interpreted in one of two ways. First, they may simply be the
result of judgment error. This is a likely explanation for two reasons. One is
that they do not occur with the means after averaging out the error. The other
is that chance errors from a single rank ordering of a list of events are very
likely when the event probabilities are nearly equal. In particular, if one
assumes that people correctly use the multiplicative rules of classic
probability theory, but base these calculations on noisy probability estimates,
then more frequent conjunction errors are predicted to occur by chance for the
$H \wedge H$ case [43]. A second, and possibly more interesting, interpretation
is that double conjunction errors may reflect an unusual situation in which an
entirely new unitized or configural concept emerges. More formally, a new
subspace $AB''$ is formed that corresponds to a projector which cannot be
decomposed into a product of the two projectors for the subspaces $A''$, $B''$.
The quantum concept of entanglement has been used to describe this new type of
configuration [6].

4.2 Disjunction errors and unpacking effects 

Next let us consider the disjunction fallacy (Fact 2). Once again $\psi$ is the
memory state following the Linda story, $M_H$ is the retrieval cue or projector
for feminist, and $M_L$ is the retrieval cue or projector for bank teller. The
quantum probability of the single event is $q(H'') = 1 - q(H^{\perp})$ and the
quantum probability for the disjunction is
$q(L'' \cup H'') = 1 - q(L^{\perp} \cap H^{\perp})$. Therefore, the disjunction
fallacy requires
$q(H'') > q(L'' \cup H'') \Leftrightarrow q(H^{\perp}) < q(L^{\perp} \cap H^{\perp})$.
We (the theorists) can compare these predictions by expanding $q(H^{\perp})$
like we did for $q(L'')$ in Equation 6:

$$
\begin{aligned}
q(H^{\perp}) &= |M_{H^{\perp}} \cdot \psi|^2 = |M_{H^{\perp}} \cdot (M_L + M_{L^{\perp}}) \cdot \psi|^2 \\
             &= q(L'' \cap H^{\perp}) + q(L^{\perp} \cap H^{\perp}) + Int_{H^{\perp}}, \\
Int_{H^{\perp}} &= 2 \cdot \mathrm{Re}[(M_{H^{\perp}} M_L \psi)^{\dagger} \cdot (M_{H^{\perp}} M_{L^{\perp}} \psi)].
\end{aligned}
\tag{9}
$$

Using this result, we find that we require negative interference again,
$Int_{H^{\perp}} < -q(L'' \cap H^{\perp})$, to produce the disjunction effect.
As before we expect a single conjunction error when one event is high (in this
case it is $L^{\perp}$), and one event is low (in this case it is $H^{\perp}$).
The psychological intuition in this case is the following. The disjunction
effect occurs when $q(L^{\perp} \cap H^{\perp})$ becomes exaggerated, and this
happens because it is easy to think of Linda not being a bank teller (which
leads one to say no), and once you start thinking about bank tellers, it becomes
harder to think about Linda as a feminist (which again leads one to say no). But
saying no to both of these questions leads to the conclusion that the
disjunction is false.

For example, consider once again Figure 2. Suppose we compare $q(X'')$ with
$q(U'' \cup X'')$. From our earlier calculations, we found that
$q(X'') = |M_X S|^2 = .4848$. For the sequential disjunction we obtain
$q(U'' \cup X'') = 1 - q(U^{\perp} \cap X^{\perp}) = 1 - |M_{Y+Z} \cdot M_{V+W} \cdot S|^2 = .4848$.
Thus we find $q(X'') = q(U'' \cup X'')$, which is still a disjunction error
because, using the relations implied by Figure 2, the probability of a yes to
the question about $(U \wedge {\sim}X)$ is strictly positive. According to
classic probability, if $p(U \cap {\sim}X) > 0$ then
$p(U \cup X) = p(U \cap X) + p({\sim}U \cap X) + p(U \cap {\sim}X) > p(U \cap X) + p({\sim}U \cap X) = p(X)$.
Thus classic probability requires $p(U \cup X) > p(X)$ with strict inequality in
this example.
The real challenge is to explain Fact 3 in which both conjunction and
disjunction fallacies occur within the same person and set of questions. This
requires $Int_L < -q(H^{\perp} \cap L'')$ and
$Int_{H^{\perp}} < -q(L'' \cap H^{\perp})$, and these constraints need to be
checked for feasibility. In the appendix, we show that this set of constraints
requires $q(L'' \cap H'') = |M_H M_L \psi|^2 < |M_L M_H \psi|^2 = q(H'' \cap L'')$,
which is consistent with the theory when the events are incompatible.
Psychologically speaking, processing the high event first must facilitate
retrieving a positive conclusion to the conjunction more than processing the low
event first. As we have seen above, the sequential conjunction depends on the
order.

The containment fallacy (Fact 4) can be explained using either Equation 6 or 9,
but it is more natural to use the former because each question is actually about
a single event. When shown a ski photo and asked to judge the likelihood that it
came from Switzerland (question S), the person answers yes to this event
directly with quantum probability $q(S'')$. Similarly, when shown the ski photo
and asked about the likelihood that it came from Europe (question E), the person
answers yes with quantum probability $q(E'')$. To compare these two
probabilities, we (the theorists) need to express $q(E'')$ in terms of $q(S'')$
as follows:

$$
\begin{aligned}
q(E'') &= |M_E \psi|^2 = |M_E \cdot I \cdot \psi|^2 \\
       &= |M_E (M_S + M_{S^{\perp}}) \cdot \psi|^2 \\
       &= q(S'' \cap E'') + q(S^{\perp} \cap E'') + Int_E \\
       &= q(S'') \cdot q(E''|S'') + q(S^{\perp} \cap E'') + Int_E \\
       &= q(S'') \cdot 1 + q(S^{\perp} \cap E'') + Int_E \\
       &= q(S'') + q(S^{\perp} \cap E'') + Int_E.
\end{aligned}
$$

Once again we require negative interference, $Int_E < -q(S^{\perp} \cap E'')$,
to produce the containment effect. The probability of the direct path from the
state (produced by the ski picture) to a positive conclusion about question E
(from Europe) is low, but the probability of the indirect path through S (from
Switzerland) and then to E (from Europe) is very high, and so the interference
is negative. This also requires us to assume that people are using incompatible
representations of these two events, even though one question is about a
subgroup of a larger group referred to in the other question. This may be a way
of formalizing the gist concept used in fuzzy trace theory to explain 'class
inclusion' illusions [56].

Now consider unpacking effects (Fact 5). These effects can also be described by
interference between incompatible events [57]. The initial finding by
Rottenstreich and Tversky (1997) was that unpacking an event D (death from
murder) into a question about a likely event S (killed by a stranger) and
another event (${\sim}S$, killed by an acquaintance) increases the judged
probability when compared to the packed event. This finding was explained by
availability and formally incorporated as an assumption into support theory, but
quantum theory derives the effect using the same line of reasoning as used for
the conjunction error. First consider the judgment for the packed quantum event
$D''$, which we (the theorists, not the judge) expand in the same way as we did
in Equation 6:

$$
\begin{aligned}
q(D'') &= |M_D \cdot (M_S + M_{S^{\perp}}) \cdot \psi|^2 \\
       &= q(S'' \cap D'') + q(S^{\perp} \cap D'') + Int_D \\
       &= q(S'') \cdot q(D''|S'') + q(S^{\perp}) \cdot q(D''|S^{\perp}) + Int_D \\
       &= q(S'') \cdot 1 + q(S^{\perp}) \cdot 1 + Int_D \\
       &= q(S'') + q(S^{\perp}) + Int_D.
\end{aligned}
$$

The judgment for the implicit unpacked event is described by
$q(S'') + q(S^{\perp})$. In this case, the direct path to the conclusion for the
packed event has a lower probability than the sum of the indirect paths from the
unpacked events, producing negative interference:
$Int_D = 2 \cdot \mathrm{Re}[(M_D M_{S^{\perp}} \psi)^{\dagger} (M_D M_S \psi)] < 0$.
The negative interference implies that the projection of the initial state first
onto acquaintance and then onto death is negatively correlated with the
projection of the initial state first onto stranger and then onto death. This
quantum interference explanation provides an alternative to support theory for
mathematically representing the effects of availability. Sloman et al. (2004)
later found that unpacking an event D (death from disease) into a question about
a low likelihood event (N, death from pneumonia) and a residual ($N^{\perp}$,
diabetes, cirrhosis, and other diseases) reduces the judged probability compared
to the packed event. The quantum model agrees with the intuition provided by
Sloman et al. (2004) that when using an unlikely unpacked event and a residual,
the indirect paths produced by unpacking make it difficult to reach the
conclusion, and now it is easier to reach the conclusion directly from the
packed event. Although the latter finding is contrary to the formalism of
support theory, it is still consistent with the quantum formalism, but now it
produces positive interference:
$Int_D = 2 \cdot \mathrm{Re}[(M_D M_{N^{\perp}} \psi)^{\dagger} (M_D M_N \psi)] > 0$.
The positive interference implies that the projection of the initial state first
onto pneumonia and then onto death is positively correlated with the projection
of the initial state first onto the residual (diabetes, cirrhosis, etc.) and
then onto death. Although support theory fails, the quantum model provides a
mathematically consistent way to formalize this interference effect using
positive or negative inner products.

There are at least two ways to explain the partitioning effect (Fact 6) using
the quantum model. One is to use interference as we did with the implicit
unpacking effect. However, a more convincing way is to use a quantum analogue
of Fox and Rottenstreich's (2003) 'ignorance prior' (which can also be applied
to the implicit unpacking effect). The original idea was based on the use of
a classical probability function p that assigns equal prior probabilities to
each alternative under consideration. Thus a focal event receives greater
probability in the case based representation (with only one other comprehensive
alternative) as compared to the class based partition (with the comprehensive
event broken down into several alternatives). The quantum analogue uses a state
vector $\psi$ that assigns initial amplitudes of equal magnitude to each
alternative under consideration. This results in the same 'ignorance prior'
effect by assigning a larger quantum probability to the focal event in the case
based partition as compared to the class based partition.
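The quantum 'ignorance prior' can be illustrated with a small sketch. It assumes, purely for illustration, that each alternative under consideration occupies its own orthogonal dimension and that all amplitudes are real and positive; the function names and the partition sizes (two versus four alternatives) are hypothetical.

```python
import math

def equal_amplitude_state(n_alternatives):
    """State vector with amplitudes of equal magnitude for each alternative."""
    amplitude = 1.0 / math.sqrt(n_alternatives)
    return [amplitude] * n_alternatives

def focal_probability(state, focal_index=0):
    """Quantum probability of the focal alternative: its squared amplitude."""
    return abs(state[focal_index]) ** 2

# Case based partition: the focal event versus one comprehensive alternative.
case_based = focal_probability(equal_amplitude_state(2))
# Class based partition: the comprehensive alternative is split into three.
class_based = focal_probability(equal_amplitude_state(4))
print(case_based, class_based)
```

Under these assumptions the focal event receives probability 1/n for n alternatives, so it is larger in the case based partition than in the class based one.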

4.3 Averaging error, conditional fallacy, and inverse fallacy

The averaging phenomenon (Fact 9) can easily be explained by the quantum
model. This finding implies that the following inequalities are satisfied:

$$
\begin{aligned}
q(L) &= q(M \cap L) + q(M^{\perp} \cap L) + Int_L < q(M \cap L), \\
q(H) &> q(H \cap M).
\end{aligned}
$$

This pair of inequalities follows directly from the earlier analyses. The first
inequality is satisfied as long as the interference, $Int_L$, is sufficiently
negative to produce a conjunction fallacy, and the second inequality is always
true for the quantum model when the high likelihood event is processed first.

Now let us turn to Fact 7, the conditional fallacy. According to quantum theory,
the implication $J(H \mapsto L)$ is represented by Lüder's rule. But Lüder's
rule (like Bayes rule) cannot produce a conditional fallacy because
$q(H'' \cap L'') = q(H'') \cdot q(L''|H'') \leq q(L''|H'')$. However, there is a
simple alternative quantum explanation for this fallacy. Recall that Lüder's
rule assumes that the judge first makes a transition from the initial state
(based on the story) to a state consistent with the antecedent of the
implication, $\psi \to \psi_H$, before determining the probability of the
consequent of the implication. This projection only takes place if the judge
attends to the antecedent and accepts it as true. Otherwise the projection fails
to take place and the probability is based on the projection of the initial
state $\psi$ onto the consequent of the implication, which implies that the
judgment for the implication is based simply on $q(L'')$. Finally, if there is
negative interference, then we obtain $q(L'') < q(H'' \cap L'')$. This
explanation agrees well with the findings of Miyamoto et al. (1998), who found
J(the temperature remains below 38°F) ≈ J(if it rains then the temperature
remains below 38°F) < J(it rains and the temperature remains below 38°F).
At the same time, it can accommodate the findings of Tversky and Kahneman
(1983) by assuming that for this medical story, the truth of the antecedent 'age
is over 50' was attended and accepted as true, and a projection did occur before
determining the probability of the 'heart attack' event.

A quantum explanation for the inverse 'fallacy' is based on the idea that some
questions may be represented by simple rays (i.e., one dimensional subspaces).
Consider two different questions: one is about an event represented by a ray
$A''$ (corresponding to the unit length vector A), and another is a question
represented as another ray $B''$ (corresponding to the unit length vector B).
Consider the quantum probability for the implication $A \mapsto B$ computed from
Lüder's rule. First we project the initial state $\psi$ onto the ray $A''$ and
normalize:

$$
\psi_A = M_A \psi / |M_A \psi| = (A \cdot A^{\dagger} \cdot \psi) / |A \cdot A^{\dagger} \cdot \psi| = A.
$$

As can be seen in this special case of events based on rays, the state simply
changes from the vector $\psi$ to the vector A. Next we compute the quantum
conditional probability for one ray given another ray:

$$
q(B''|A'') = |M_B \psi_A|^2 = |B \cdot B^{\dagger} \cdot A|^2
           = |B|^2 \cdot |B^{\dagger} \cdot A|^2 = |B^{\dagger} \cdot A|^2.
$$

As can be seen from the above, this is just the squared magnitude of the inner
product between vectors A and B. However, if we repeat this procedure in the
opposite direction for $B \mapsto A$ we obtain
$q(A''|B'') = |A^{\dagger} \cdot B|^2$. Thus whenever events $A''$, $B''$ are
simply rays, we obtain the equality

$$
q(A''|B'') = |A^{\dagger} \cdot B|^2 = |B^{\dagger} \cdot A|^2 = q(B''|A'').
$$

It is important to remember that this equality is not true in general for
subspaces that have dimensions greater than one, because in this case
conditional probability does not reduce to a single inner product. Thus quantum
theory can explain the inverse fallacy whenever the questions are represented as
simple one dimensional rays. This can happen if a person relies on an
oversimplified vector space representation to answer questions. Suppose an
individual is asked a question such as 'disease present (or absent) given test
result positive (or negative).' In this case, a person may represent the problem
using two incompatible sets of projectors operating within the same two
dimensional space: one set based on eigenvectors $\{D, D^{\perp}\}$ representing
disease present or absent, and another based on eigenvectors $\{T, T^{\perp}\}$
representing a positive or negative test. Given this oversimplified
representation, the person would produce an inverse fallacy because
$q(D''|T'') = |D^{\dagger} \cdot T|^2 = |T^{\dagger} \cdot D|^2 = q(T''|D'')$.
Note, however, that this oversimplified representation of events could not be
used to answer questions about other diseases and/or combinations of test
results, which would require a higher dimensional space.
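The symmetry for rays is easy to confirm numerically. The sketch below is a minimal illustration, assuming each question is a single unit vector (possibly complex); the vectors chosen for the disease and test rays are hypothetical, as are the function names.

```python
import math

def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in vector))
    return [x / norm for x in vector]

def inner(u, v):
    """Inner product u-dagger times v (conjugate taken on the first argument)."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def conditional_for_rays(b, a):
    """q(B|A) when A and B are rays: the state becomes A, so q = |<B, A>|^2."""
    return abs(inner(b, a)) ** 2

# Hypothetical rays in a two dimensional space.
disease = normalize([1, 0])
test_positive = normalize([1, 1])
q_d_given_t = conditional_for_rays(disease, test_positive)
q_t_given_d = conditional_for_rays(test_positive, disease)
print(q_d_given_t, q_t_given_d)
```

Both conditionals come out equal for any pair of rays, because the magnitude of an inner product is unchanged when its arguments are swapped.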



4.4 Order effects and response mode 

Finally we consider the order effect that occurs when conjunctions are rated
first as compared to last (compare the circles with the dots in Figure 1).
Suppose the constituent events (L, H) are rated separately first. Then either
one of these two estimates can be used later to estimate the conjunctive
probability. If the person selects the $q(L'')$ estimate, then the conjunction
can be computed from $q(L'') \cdot q(H''|L'') \leq q(L'')$ and no conjunction
error can occur. If the person selects the $q(H'')$ estimate, then the
conjunction is computed from $q(H'') \cdot q(L''|H'') = q(H'' \cap L'')$, which
can exceed $q(L'')$ to produce a conjunction error. So if some proportion of
people start with each estimate, then the conjunction error is reduced by this
proportion. This reduction does not happen in the reverse order because in this
case, the 'start with the higher probability event first' rule applies and the
conjunction is always computed from $q(H'' \cap L'')$, which produces
conjunction errors. This order effect can also explain why ratings produce fewer
errors than rank orders, because the latter do not request any estimates of the
constituents ahead of time. This explains Fact 15.
5 Comparison with Previous Theories

A brief comparison of the quantum model with three previous theories for
conjunction and disjunction errors (and related findings) is presented below.
Table 1 provides a summary indicating whether or not each theory can explain
each finding (y=yes, n=no, u=unknown).

Table 1: Summary of each explanation for 15 major findings.
Note: R = representativeness, A = averaging, M = memory, Q = quantum.



Tversky and Kahneman (1983) initially argued that many judgment errors 
result from the use of the representativeness heuristic, which relies on the sim- 
ilarity between the features generated by the story and the features entailed 
by the event in question. This idea agrees well with the quantum probabil- 
ity model. The representativeness heuristic was initially used to explain Facts 
1,2,3,4,7. It was never applied to Facts 5,6,8,9,10. It is at least consistent with 
Facts 11,12. This explanation fell into disfavor mainly because of Fact 13, in 
which conjunction errors occurred almost equally often for related and
unrelated events. For example, suppose $L_1$ = 'Linda is a bank teller' and
$H_2$ = 'Bill is an accountant.' If $J(L_1 \wedge H_2) > J(L_1)$, then it is
argued that this result
cannot arise from representativeness - there is no single stereotype or prototype 
associated with the conjunction in this case. Finally, representativeness agrees 
well with Fact 14. Fact 15 is consistent with the idea that heuristic thinking is 
more likely to be evoked by ranking procedures, and analytic thinking is more 
likely to be evoked by rating scales, but this is a bit post hoc. One outstanding 
problem with the notion of representativeness is that it lacks a clear or rigorous 
formalization, which makes it difficult to determine exactly what it predicts or 
does not predict [58]. Support theory is such a formalization, but it is based on
availability rather than representativeness [24]. It was devised to explain Facts
5 and 6, but it has not been systematically applied to the other facts. How- 
ever, in their discussion section, Tversky and Koehler (2004) point out that one 
could use support theory to model the conjunction fallacy by viewing a ques- 
tion about event B as a 'packed' version corresponding to the unpacked event 
$(B \wedge F) \vee (B \wedge {\sim}F)$. While the unpacked event must produce a greater judged
probability than a conjunction that it contains, the packed version could pro- 
duce a smaller judged probability than a conjunction it contains. This is closely 
related to the explanation from quantum theory which uses the expansion in 
Equation 6 in a similar manner. 

A second major competing explanation is that people average the evidence
provided by each separate event [59], and this explanation is gaining support
[49]. According to this theory, a person assigns a subjective probability to
each component event that may appear in a combination. For example, $S(A)$,
$S(B)$, and $S(C)$ denote subjective probabilities assigned to single events A,
B, C, which may also appear in questions about combinations of these events.
Importantly, these subjective probabilities are assigned independently of the
other events with which they are combined. For example, the same $S(A)$ is used
for event A, $(A \vee B)$, and $(A \wedge C)$. The evidence for an event
composed of the two individual events (A, B) is formed by an average:
$r(A \text{ and } B)$ denotes the weighted average when asked about the
conjunction $(A \wedge B)$, and $r(A \text{ or } B)$ denotes the weighted
average for the disjunction question $(A \vee B)$. For any two events, one is
more likely than the other, and so we will let event H represent the more likely
event and L represent the less likely event of the pair. Then the averages for
the conjunction and disjunction are

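$$
\begin{aligned}
r(L \text{ and } H) &= S(L) + w_1 \cdot [S(H) - S(L)] = (1 - w_1) \cdot S(L) + w_1 \cdot S(H), \\
r(H \text{ or } L) &= S(H) + w_2 \cdot [S(L) - S(H)] = (1 - w_2) \cdot S(H) + w_2 \cdot S(L).
\end{aligned}
\tag{10}
$$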

Different weights (0 < i^i < 1) are used for different types of questions about 

combinations of events. How does one determine these weights? For conjunction 
tasks, it is assumed that people anchor on the lower probability, and then ad- 
just upward toward the higher probability producing the processing order L&i? ; 
for disjunction, people anchor on the higher probability, and adjust downward 
toward the lower probability producing the processing order HorL. This pro- 
cessing order presumably causes more weight to be placed on the lower prob- 
ability event for conjunctions; and it causes more weight to be placed on the 
higher probability event for disjunctions. This idea of anchoring and adjust- 
ment is analogous to the processing order assumptions used in the quantum 
model. Alternatively, a geometric average has been used to model these effects 
[60] , but this model makes the same ordinal predictions as the averaging model 
because the log of the geometric average equals the arithmetic average. So far 
this model has not been applied to implications and conditional probabilities, 
and so it remains unclear how to do this. The averaging model readily explains 
Facts 1,2,3. It has never been applied to Facts 4,5,6,7,8 and it is unclear how 
it would explain these. It agrees very well with Fact 9. But the averaging 
model has major problems with Facts 10 and 11 because it ignores dependen- 
cies between events. A deterministic interpretation of the averaging model also 
has problems with Fact 12: strictly speaking, a single conjunction error should 
always occur for unequal events because the average always falls between the 
two unequal values being averaged. The same problem is true for disjunction 
errors. This is a problc^m because conjunctions and disjunction errors do not 
always occur. For some people they never occur and for some pairs of questions 
they never occur. This problem can fixed partly by assuming there is noise in 
the judgment process, in which case zero and double conjunction/disjunction 
errors can occur by chance [49]. This implies that 'correct' judgments (i.e., zero 
errors) are caused by random noise and single conjunction errors are the norm 
for all people and for all pairs of questions. This model has no explanation 
for the last two facts 14, 15. One final criticism of the averaging model is its
lack of coherence: the weights assigned to implications are not constrained by
assignments to conjunctions, and the model is unable to handle mutually
exclusive events. For example, suppose $J(A) > J(\sim A)$; then the averaging model
implies that $J(A) > J(A \vee \sim A) > J(A \wedge \sim A) > J(\sim A)$, and so conjunction and
disjunction errors should be frequent with mutually exclusive events. Recent
evidence indicates that conjunction and disjunction errors are greatly reduced 
when they are formed from mutually exclusive events [51]. 
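
The two criticisms above can be illustrated with a minimal sketch of a weighted averaging rule. The function name `averaging_judgment`, the weight `w = 0.7`, and all judgment values are hypothetical choices made for illustration; they are not parameters reported for the averaging model in the literature.

```python
def averaging_judgment(j_first, j_second, w=0.5):
    """Weighted average of two single-event judgments.

    w is a hypothetical weight placed on the first (anchoring) event.
    """
    return w * j_first + (1.0 - w) * j_second


# Illustrative judgments for a likely event H and an unlikely event L.
j_h, j_l = 0.8, 0.2

# Conjunction: more weight on the lower event; disjunction: more on the higher.
conj = averaging_judgment(j_l, j_h, w=0.7)
disj = averaging_judgment(j_h, j_l, w=0.7)

# A deterministic average of unequal values always lies strictly between them,
# so a single conjunction error and a single disjunction error always occur.
single_conjunction_error = j_l < conj < j_h
single_disjunction_error = j_l < disj < j_h

# Complementary events A and ~A: the same rule still produces both errors,
# giving the ordering J(A) > J(A or ~A) > J(A and ~A) > J(~A).
j_a, j_not_a = 0.7, 0.3
conj_c = averaging_judgment(j_not_a, j_a, w=0.7)
disj_c = averaging_judgment(j_a, j_not_a, w=0.7)

print(single_conjunction_error, single_disjunction_error)
print(j_a, disj_c, conj_c, j_not_a)
```

Under these assumed values the averaging rule predicts errors for mutually exclusive events just as readily as for any other pair, which is the incoherence noted in the text.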

A third major explanation is that probability judgment errors result from 
a feature based memory retrieval process [61]. Two types of explanations were 
proposed, one for judgments based on stories (vignettes), and the other for judg- 
ments based on training examples, but all of the studies in our review are based
on stories (vignettes), and so we limit our discussion to the first explanation.
According to this model, information about the story is stored in a memory trace
(column) vector denoted T. The coordinates of this memory vector represent 
positive or negative feature values related to the story, and zeros are assigned 
to features unrelated to the story. A single question A is represented by a probe
(column) vector of the same length, $P_A$, with values assigned to features related
to both the question and the story, and zeros otherwise. Retrieval strength (echo
intensity) to a question is determined by the inner product between the memory
trace vector and the question probe vector, $I_A = [(P_A^{\top} \cdot T)/N_A]^3$. Note that the
inner product is normalized by dividing it by a number, $N_A$, that depends on
the number of nonzero elements in the question probe vector, and the inner
product is cubed to quench small echoes and exaggerate strong echoes. Frequency
or relative frequency judgments are assumed to be proportional to echo
intensity (which requires the intensity to be non-negative). The use of feature 
vectors to represent the memory state for stories and the probe for questions, 
as well as the use of inner products to determine probability judgments, is con- 
ceptually close to quantum probability theory (except that the inner products 
would be squared rather than cubed to produce quantum probabilities). Condi- 
tional probabilities are estimated by a two-part process of first retrieving traces 
similar to the probe, and then applying a threshold that retains only traces 
with sufficiently strong echoes. The threshold mechanism is not part of Bayes'
rule used in classic theory nor Lüders' rule used in quantum theory, although
it reduces to Bayes' rule as a special case for very low thresholds. However, for
conjunction questions, the memory retrieval model does not assume a sequence
of two retrievals (one retrieval for the first constituent and a second retrieval
conditioned on the first for the other constituent, as assumed by the quantum
model), and so it does not make use of its conditional probability mechanism for
these types of questions. Instead, a conjunctive question $H \wedge L$ is represented
by a single conjunctive probe, which is the sum (concatenation)^ of the probes
used for the two constituent question vectors, $P_{H \& L} = P_H + P_L$. The echo

^Dougherty et al. described the conjunctive probe as the concatenation of two minivectors,
but this is the same as summing two non-overlapping vectors. If $P_H$ is a row minivector for
H with length $N_H$, and $P_L$ is a row minivector for L with length $N_L$, and $0_N$ is a row vector
of N zeros, then $[P_H | P_L] = [P_H | 0_{N_L}] + [0_{N_H} | P_L]$.
intensity of this conjunction probe produces something akin to an average, as
given in Equation 11. In particular, Equation 11 implies
$\sqrt[3]{I_L} < \sqrt[3]{I_{H \& L}} < \sqrt[3]{I_H}$, which by monotonicity
implies $I_L < I_{H \& L} < I_H$. In short, the memory retrieval model explains the
conjunction error (Fact 1) in the same way as the averaging model. The mem- 
ory model has an additional advantage because it provides a similarity based 
mechanism for determining the echo intensities for each question. This is useful 
for relating the features in the memory model to the semantic aspects of story 
and a single event. Dougherty et al. (1999) did not address disjunction errors 
(Facts 2,3,4), and it is unclear how the model applies to these results; but more 
recent extensions have been formulated to account for 'unpacking' effects [62]. The
memory model includes a mechanism for computing conditional probabilities,
which could be (but has not been) used to explain Fact 7. If this conditional
mechanism reverses direction, it would produce an inverse fallacy, but the theory
does not explain why people sometimes reverse the conditional, and so it does
not really explain Fact 8. Like the averaging model, the memory model can
accommodate Fact 9, but at this point it has no explicit mechanism for explaining
event dependencies and conjunction effects (Facts 10, 11). The latter problem
arises from the fact that the features are defined for each event separately, and 
then they are simply added (concatenated) together for conjunctions. This is 
the same problem that arises with the averaging model. Also like the averaging 
model, the memory retrieval model always predicts single conjunction errors, 
and like the averaging model, some type of error or sampling variability is re- 
quired to explain the occurrence of zero or double conjunction errors (Fact 12). 
The memory retrieval model can accommodate related and unrelated conjunction
errors using the summation of individual vectors to represent conjunctions (Fact
13). The memory model, being based on an inner product measure of similarity,
is also consistent with Fact 14, but it does not address response mode or order
effects (Fact 15). 
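
The averaging behaviour of the conjunctive probe can be checked with a small sketch. It assumes that the normalizer $N_A$ is exactly the number of nonzero probe elements (the text says only that it depends on this number), and the trace and probe vectors are hypothetical values chosen for illustration.

```python
def echo_intensity(probe, trace):
    """Cubed, normalized inner product between a probe and the memory trace.

    Assumption: the normalizer N is the count of nonzero probe elements.
    """
    n = sum(1 for p in probe if p != 0)
    inner = sum(p * t for p, t in zip(probe, trace))
    return (inner / n) ** 3


# Hypothetical story trace over eight features.
trace = [1, 1, 1, 1, 1, 1, 1, -1]

# Non-overlapping probes: H uses the first four features, L the last four.
p_h = [1, 1, 1, 1, 0, 0, 0, 0]
p_l = [0, 0, 0, 0, 1, 1, 1, 1]

# Conjunctive probe: sum (equivalently, concatenation) of the two probes.
p_hl = [a + b for a, b in zip(p_h, p_l)]

i_h = echo_intensity(p_h, trace)
i_l = echo_intensity(p_l, trace)
i_hl = echo_intensity(p_hl, trace)

# The conjunction falls between its constituents: a single conjunction error.
print(i_l, i_hl, i_h)
```

With these vectors the cube root of the conjunctive intensity is the average of the cube roots of the two constituent intensities, so the conjunction is judged more likely than L but less likely than H.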

The summary shown in Table 1 indicates that the quantum model provides 
a more comprehensive account of conjunction and disjunction errors and closely 
related phenomena in comparison with the other three theories. We do not 
conclude that the quantum model is superior in general to any of these other 
theories - they have been applied to many other phenomena beyond conjunction 
and disjunction errors that are not covered here (such as inference problems 
and base rate neglect). We simply wish to conclude that the quantum model 
provides a viable and promising new approach to understanding conjunction 
and disjunction errors and related phenomena. Future work will extend the 
model to inference [63]. 



$$\sqrt[3]{I_{H \& L}} = \frac{P_H^{\top} \cdot T}{N_{H \& L}} + \frac{P_L^{\top} \cdot T}{N_{H \& L}} \qquad (11)$$
5.1 Model complexity and testability



Quantum probability contains classic probability as a special case, and there- 
fore it is more complex than classic probability theory.^ But so are the other 
explanations for judgment fallacies, and this criticism is not unique to quantum 
theory. It is hard to say at this point which of the competing explanations is 
more complex. For example, we don't know at this point if the quantum or 
memory retrieval model is more complex. We can, however, point to places 
where this quantum model makes clear testable predictions for future research. 

The quantum model must predict that single conjunction errors only occur 
when one event has a high likelihood and the other has a low likelihood, and 
zero conjunction errors should occur when the two probabilities are equal (ex- 
cept for response errors). These predictions agree well with the results presented 
in Figure 1. The same prediction must hold for disjunction errors, which is also 
empirically supported. Quantum probability theory cannot produce a double 
conjunction error (except by response error). If future research shows that this
phenomenon is systematic and replicable, then the quantum model needs to be
extended to provide a new principle for forming conjunctive concepts that are 
unitized and can no longer be decomposed into parts. Quantum probability 
theory predicts no conjunction or disjunction errors for complementary events 
A, ~ A (except those produced by response errors), whereas the averaging model 
predicts they will be as robust as ever. In fact, Wolfe and Reyna (2009) report a 
reduction but not elimination of conjunction and disjunction errors for comple- 
mentary events as compared to pairs that overlap in probability. The quantum 
model must predict that if events A and B are found to be compatible in a study 
on conjunctions, then these same two events A, B are predicted to produce no 
disjunction errors either. The quantum model also predicts that if events A, B 
are found to produce a conjunction error, then they must be incompatible, and 
therefore the judgments of the joint probabilities must change depending on the 
order that they are processed. Although this has not been directly tested yet, 
other evidence for order effects consistent with this prediction was presented. 
Finally, the theory makes novel predictions for conjunctions, disjunctions, and 
conditionals involving more than two events, and these predictions also can be 
derived directly from the general principles without adding any new assumptions.
An important step for future work is to include a more complete choice
response model using quantum theory, and some initial steps in this direction
have been made [65]. This addition is critical for deriving quantitative
and probabilistic rather than qualitative and deterministic predictions from the 
model. 

^ Quantum theory is not the only way to generalize classic probability theory. An alternative
is to describe events as open sets from a topology which replaces set theoretic complementation
with a less stringent pseudo-complementation [64]. The latter theory has been used to explain
'unpacking' effects.
6 Fuzzy Reasoning Under Uncertainty



Both classic (Kolmogorov) and quantum (von Neumann) probability theories 
are based on a coherent set of principles. In fact, classic probability theory 
is a special case of quantum probability theory in which all the events are 
compatible, which generates a simple Boolean algebra of events. Incompatible 
events produce a more complex 'partial' Boolean algebra of events [44]. So 
why do we need to use incompatible events, and isn't this irrational? In fact, 
the physical world obeys quantum principles and incompatible events are an 
essential part of nature. Clearly there are many circumstances where everyone 
agrees that the events should be treated classically (such as random selection of 
balls from urns or dice throwing). But incompatible events may be essential for 
understanding our commonly occurring but nevertheless very complex human 
interactions. For instance, when trying to judge something as uncertain as 
winning an argument with another person, the likelihood of success may depend 
on using incompatible representations that allow viewing the same facts from 
different perspectives. 

The use of incompatible events introduces a new and potentially useful con- 
cept to cognition, which is called a superposition state. This concept is funda- 
mentally different than the concept of a mixed state used in classic Bayesian 
probability theory. Consider the events depicted in Figure 2 again. Suppose that 
a voter knows that she definitely will not vote for the independent candidate 
(Z), and therefore she must vote for either the democrat (X) or the republican
(Y). If she is in a classic mixed state at a moment before the decision, then she 
is exactly in one state (favoring a vote for democrat) or the other (favoring a 
vote for republican) and not both at that moment. If she cannot consciously 
say which state exists at that moment, she could express the probability that 
the true state is one that favors democrat or republican, but a precise state 
does exist at the moment before the decision, and the final act of voting simply 
records the immediately preexisting but possibly unknown state. If she is in 
a quantum superposition state, then she is not exactly in a state favoring the 
democrat, not exactly in a state favoring the republican, and not exactly in 
both states immediately before the vote is cast. She cannot verbally say (with 
respect to democrat and republican) exactly what the state is before the vote is 
cast, because no clear state exists with regard to these two outcomes. Perhaps
it is best characterized as a fuzzy state with regard to democrat and repub- 
lican before the measurement. The act of voting creates a clear and specific 
state (e.g. the person becomes a democratic voter after casting her vote). The 
superposition state better matches the well accepted idea that preferences are 
constructed on the spot for the purpose of making judgments or taking actions 
rather than being determined a priori [66].

What are the behavioral implications of this distinction between mixed ver- 
sus superposition states? Suppose the following conditional probabilities are 
known to be true for both the classical and quantum systems. (These probabil- 
ities match the situation depicted in Figure 2.) If the person is not an indepen- 
dent, then there is a .50 chance that she votes democrat or republican; if the 
person votes democrat, then there is a .50 chance she claims to be a moderate; if
the person votes republican, then there is a .25 chance that she claims to be
a moderate. Now according to classic probability theory, if the person tells us
she is not an independent, then she is going to vote for a democrat or republican
(exclusively), and so the total probability that she will claim to be moderate (if
asked right before casting her vote) must be (.50 · .50) + (.50 · .25) = .375. According
to the quantum system, if we learn that the person is not independent, then
the person is in the superposed state A in Figure 2 (which is neither democrat
nor republican), and the probability of being a moderate from this state (before
casting the vote) turns out to be exactly zero (the vector A is orthogonal to the
vector U for moderate in Figure 2)! So we see that superposition states do not
behave the same way as classic mixed states. 
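
The contrast between a mixed state and a superposition state can be sketched numerically. The first computation reproduces the classic total probability from the text. The two-dimensional vectors that follow are hypothetical and are not the vectors of Figure 2; they are chosen only so that the superposition is orthogonal to the queried ray while the corresponding classical mixture is not.

```python
import math


def inner(v, w):
    """Inner product of two real vectors."""
    return sum(a * b for a, b in zip(v, w))


# Classic mixed state with the conditional probabilities quoted in the text.
classic_text_example = 0.50 * 0.50 + 0.50 * 0.25

# Hypothetical 2-D illustration (not the vectors of Figure 2).
s = 1.0 / math.sqrt(2.0)
X = (1.0, 0.0)   # ray for "votes democrat"
Y = (0.0, 1.0)   # ray for "votes republican"
U = (s, s)       # hypothetical ray for "claims to be moderate"

# Mixed state: the voter is in X or in Y, each with probability .50.
p_u_given_x = inner(U, X) ** 2
p_u_given_y = inner(U, Y) ** 2
mixed = 0.5 * p_u_given_x + 0.5 * p_u_given_y

# Superposition state with the same .50/.50 probabilities for X and Y.
S = (s, -s)
p_x_from_s = inner(X, S) ** 2
p_y_from_s = inner(Y, S) ** 2
superposed = inner(U, S) ** 2   # S is orthogonal to U

print(classic_text_example, mixed, superposed)
```

Both states assign the same probabilities to the two voting outcomes, yet they disagree about the third question, because the amplitudes in the superposition cancel where the mixture probabilities add.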

The superposition concept is related to other theories of fuzzy reasoning. 
Fuzzy set theory has been useful in psychology for representing vague verbal ex- 
pressions used in natural language [67] . For instance, a vague expression such as 
'Tom is short' is represented in fuzzy set theory by a membership function that 
assigns membership values to the levels of a meter scale. This corresponds to the 
quantum superposition state with probability amplitudes assigned to eigenvec- 
tors associated with the different levels on the meter scale. Quantum theory can 
enhance fuzzy set theory by providing a more powerful formalism for evaluating 
complex combinations of expressions. Fuzzy trace theory has been useful for un- 
derstanding how people use gist versus precise representations to reason under 
uncertainty [56]. The superposition principle provides a natural way to repre- 
sent a 'gist' state as a superposition over precise values. Consider the gist 'this 
program could save lives.' This gist could be represented as a classical mixed 
state by a probability distribution over number of lives saved. Alternatively, 
it could be represented as a superposition state by an amplitude distribution 
over eigenvectors representing number of lives saved. The classic representation 
assumes that exactly one and only one number saved is the correct hypothesis, 
but we don't know which one it is, and we assign a probability to each hypoth- 
esis; whereas the quantum representation rejects the assumption that exactly 
one number is correct, and instead retains a fuzzy representation. As pointed 
out above, these two different representations of uncertainty can produce very 
different predictions for decision making behavior [68]. Quantum theory could 
be useful for formalizing some of the principles of fuzzy trace theory. 

In summary, we argue that it is important to introduce a distinction between 
compatible and incompatible representation of events when describing human 
judgments. More accurately, we should say 're-introduce' this distinction, be- 
cause Bohr actually got the idea of complementarity from William James. Hu- 
man judges may be capable of using either compatible or incompatible repre- 
sentations, and they are not constrained or forced to use just one. The use of 
compatible representations produces judgments that agree with the classic and 
Bayesian laws of probability, whereas the use of incompatible representations 
produces violations. But the latter may be necessary to deal with deeply un- 
certain situations (involving unknown joint probabilities), where one needs to 
rely on simple incompatible representations to construct sequential conjunctive 
probabilities coherently from quantum principles. In fact, both types of
representations, compatible and incompatible, may be available to the judge, and the
context of a problem may trigger the use of one or the other [56]. More advanced
versions of quantum probability theory (using a Fock space, which is analogous 
to a hierarchical Bayesian type model) provide principles for combining both 
types of representations [69]. 

7 Concluding Comments 

It is worthwhile to briefly consider how a quantum framework for cognition
relates to the many alternative models which have been explored in the last
few decades. A quantum approach is most closely aligned to Bayesian ap- 
proaches. In the latter, inference is guided by the updating of probabilities 
through Bayes's rule. Bayesian models have been successfully applied to many 
aspects of cognition, such as similarity [70], reasoning [71], language process- 
ing [72], and categorization [73]. As noted earlier, quantum probability theory 
has an analogue of Bayes's rule, called Lüders' rule, which is used to update
inferences. Of course, there are other key differences between quantum and 
Bayesian approaches, notably the order-dependence of operations in the former 
which are order-independent in the latter (such as conjunction and disjunc- 
tion). In this sense, a quantum approach can be thought of as a generalized 
Bayesian approach. There are also relations between quantum approaches and 
other computational paradigms for modeling cognition. For example, quantum 
computing models [74] provide parallel processing capabilities as championed by 
connectionist models [75] , but at the same time they are able to take advantage 
of condition - action procedures that match classical production rule systems 
[76]. However, learning is a new challenge for quantum information processing 
systems. On the basis of some of the relevant computational examinations in 
the literature [77], we can provisionally suggest that the main difference be- 
tween quantum models and such alternatives would be the order dependence of 
operations in quantum information processing. In the field of decision making, 
we have examined the dynamics of quantum models in more detail and have 
found evidence for interference effects which sharply contrast with that of cor- 
responding Markov models [78]. Such results indicate that a quantum approach 
to modeling cognition will produce very distinct computational predictions. 

In closing, quantum probability theory is brand new for psychologists, cog- 
nitive scientists, and decision scientists. It may seem to be a strange idea at 
first, but once familiar with it, the theory has some appealing properties for 
cognition in particular and psychology in general. On the one hand, quantum 
probability provides a powerful and coherent framework for modeling human 
judgments that compares with classic (Kolmogorov) probability theory. On the 
other hand, it provides a geometric (similarity) based approach to probability 
that provides new psychological concepts for reasoning under uncertainty, such 
as incompatible representations of events, superposition states of beliefs, and 
interference among paths to conclusions. In this article we demonstrate that 
quantum probability theory provides a viable and promising explanation for
conjunction and disjunction fallacies and closely related phenomena. In future
work we plan to apply the model to inference tasks which in the past have been
explained using Bayesian modeling frameworks [63].
9 Appendix

9.1 Derivation of quantum probabilities from projectors 

In quantum theory, an eigenvector can be a complex vector. We do not
limit our theory to real vectors, and all derivations allow for complex vectors,
but the examples use only real vectors for simplicity. If $V$ represents an $n \times 1$
column vector with a complex coordinate $v_k = x + i \cdot y$ in row $k$, then $V^\dagger$ is
the Hermitian transpose, which changes the column vector into a row vector,
replacing the original coordinate with its conjugate $v_k^* = x - i \cdot y$ in row $k$
($i = \sqrt{-1}$; $x, y$ real). Real numbers are a special case with $y = 0$. An
amplitude $\psi_j$ may also be complex, and so the squared magnitude is defined as
$|\psi_j|^2 = \psi_j^* \cdot \psi_j$. The inner product between two complex vectors $V, W$ is a scalar
(a complex number) whose value depends on the order: $V^\dagger \cdot W = \sum_i v_i^* \cdot w_i$
and $W^\dagger \cdot V = \sum_i w_i^* \cdot v_i = (V^\dagger \cdot W)^*$, but $|W^\dagger \cdot V| = |V^\dagger \cdot W|$. Two vectors
$V, W$ are orthogonal if $V^\dagger \cdot W = 0$, and a vector is normalized if $V^\dagger \cdot V = 1$.
The outer product of an $n \times 1$ vector $V$ is the $n \times n$ matrix $V \cdot V^\dagger$ with $v_i \cdot v_j^*$
in row $i$ column $j$. If $M$ is an $n \times m$ matrix with value $v_{ij}$ in row $i$ and column
$j$, then the Hermitian transpose, $M^\dagger$, is an $m \times n$ matrix with value $v_{ji}^*$ in row
$i$ and column $j$. A projector $M$ is a matrix that is Hermitian and idempotent,
$M = M^\dagger = M \cdot M$. If $A''$ is a subspace that is spanned by eigenvectors $\{V_j,
j \in A''\}$, then the projector for this subspace equals the sum of outer products
$M_A = \sum_{j \in A''} V_j \cdot V_j^\dagger$. A unitary matrix is a square matrix $T$ that satisfies
$T^\dagger T = I = T T^\dagger$, which is a property that is needed to preserve the lengths of
vectors. The quantum probability of an event $A''$ can be expressed as follows:



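$$q(A'') = |M_A \cdot \psi|^2 = \Big| \sum_{j \in A''} V_j \cdot \psi_j \Big|^2$$

and because of orthogonality of eigenvectors

$$= \sum_{j \in A''} |V_j \cdot \psi_j|^2 = \sum_{j \in A''} |V_j|^2 \cdot |\psi_j|^2$$

and because eigenvectors are unit length

$$= \sum_{j \in A''} 1 \cdot |\psi_j|^2 = \sum_{j \in A''} |\psi_j|^2.$$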
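
The equality between the projector form and the sum of squared amplitudes can be checked numerically. The sketch below uses plain Python lists and complex numbers; the orthonormal basis, the state vector `psi`, and the index set `event` are hypothetical values, and each amplitude is computed as the inner product of an eigenvector with the state.

```python
import math


def inner(v, w):
    """Inner product V-dagger times W (conjugates the first argument)."""
    return sum(complex(a).conjugate() * b for a, b in zip(v, w))


def outer(v):
    """Outer product V times V-dagger, an n x n matrix."""
    return [[a * complex(b).conjugate() for b in v] for a in v]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_vec(m, v):
    return [sum(m_ik * v_k for m_ik, v_k in zip(row, v)) for row in m]


def squared_length(v):
    return sum(abs(x) ** 2 for x in v)


# Hypothetical orthonormal eigenvectors and a unit-length complex state.
s = 1.0 / math.sqrt(2.0)
basis = [(s, s, 0.0), (s, -s, 0.0), (0.0, 0.0, 1.0)]
psi = (0.6, 0.48j, 0.64)
event = [0, 1]   # the event is spanned by the first two eigenvectors

# Projector for the event: sum of outer products of its eigenvectors.
m_a = outer(basis[event[0]])
for j in event[1:]:
    m_a = mat_add(m_a, outer(basis[j]))

q_projector = squared_length(mat_vec(m_a, psi))
q_amplitudes = sum(abs(inner(basis[j], psi)) ** 2 for j in event)

print(q_projector, q_amplitudes)
```

Both routes give the same probability, as the derivation requires.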

9.2 Compatible events 

First we prove that if (a) the $q$ events in $Q = \{X'', Y'', Z'', \ldots\}$ are mutually
exclusive and exhaustive, (b) the $r$ events in $R = \{U'', V'', W'', \ldots\}$ are mutually
exclusive and exhaustive, (c) the events in $Q$ are not orthogonal to the events
in $R$, but (d) the events in $Q$ are all compatible with the events in $R$, then
we require at least a $q \cdot r$-dimensional vector space. Assumption (a) implies
$M_i \cdot M_{i'} = 0$ for $i \neq i'$ and $\sum_{i \in Q} M_i = I$ for events selected from $Q$; assumption
(b) implies $M_j \cdot M_{j'} = 0$ for $j \neq j'$ and $\sum_{j \in R} M_j = I$ for events selected from $R$;
assumption (c) implies $M_i \cdot M_j \neq 0$ for $i \in Q$ and $j \in R$; and assumption
(d) implies $M_i \cdot M_j = M_j \cdot M_i$ for any pair of events. Each matrix product
$M_{ij} = M_i \cdot M_j$ for $i \in Q$ and $j \in R$ is a projector because $(M_i \cdot M_j)^\dagger = M_j \cdot M_i =
(M_i \cdot M_j)$ and $(M_i M_j) \cdot (M_i M_j) = (M_j M_i M_i M_j) = (M_j M_i M_j) = (M_j M_j M_i) =
M_j M_i = (M_i \cdot M_j)$. Also each matrix product $M_{ij} = M_i \cdot M_j$ for $i \in Q$ and
$j \in R$ projects onto at least one dimension because it is a nonzero matrix
($M_i \cdot M_j \neq 0$). Finally each pair of matrix products is orthogonal because
$(M_i M_j) \cdot (M_{i'} M_{j'}) = (M_i M_j M_{j'} M_{i'}) = (M_j M_i M_{i'} M_{j'}) = 0$ if $i \neq i'$ or $j \neq j'$
for $i, i' \in Q$ and $j, j' \in R$. Thus each matrix product $M_{ij} = M_i \cdot M_j$ is a projector
that projects onto an orthogonal subspace associated with the intersection of
events $i'' \cap j''$. Finally note that $I = I \cdot I = (\sum_{i \in Q} M_i) \cdot (\sum_{j \in R} M_j) =
\sum_i \sum_j M_{ij}$, and so the product matrices $M_{ij} = M_i \cdot M_j$ for $i \in Q$ and $j \in
R$ form a spectral resolution of the identity. The identity projects onto the
entire vector space, and so the dimension of the vector space equals $\mathrm{Rank}(I)
= \mathrm{Rank}(\sum_i \sum_j M_{ij}) = \sum_i \sum_j \mathrm{Rank}(M_{ij})$, for orthogonal projectors $M_{ij}$. If
each product matrix has rank one (the minimum for a nonzero projector), then
each product matrix has one eigenvector, $V_{ij}$, and the product matrix can be
computed from the outer product of its eigenvector, $M_{ij} = V_{ij} \cdot V_{ij}^\dagger$. In this
case, any vector in the vector space can be described as a linear combination of
the eigenvectors $V_{ij}$ representing the joint event $i'' \cap j''$ for $i \in Q$ and $j \in R$:

$$\psi = I \cdot \psi = \sum_i \sum_j M_{ij} \cdot \psi = \sum_i \sum_j (V_{ij}^\dagger \cdot \psi) \cdot V_{ij}.$$

This vector space, expressed in terms of the 'joint' eigenvectors is also 
described as a tensor product space. 
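
A small numerical sketch can make the construction concrete. It assumes the simplest case, q = 2 and r = 2, and builds the two sets of projectors on a four-dimensional tensor product space with a Kronecker product. The particular 2 x 2 projectors are hypothetical, and the rank of each product projector is computed as its trace.

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[a_ij * b_kl for a_ij in row_a for b_kl in row_b]
            for row_a in a for row_b in b]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


identity2 = [[1.0, 0.0], [0.0, 1.0]]
# Hypothetical rank-one projectors: q = 2 events in Q, r = 2 events in R.
q_small = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
r_small = [[[0.5, 0.5], [0.5, 0.5]], [[0.5, -0.5], [-0.5, 0.5]]]

# Events in Q act on the first factor, events in R on the second factor.
q_proj = [kron(p, identity2) for p in q_small]
r_proj = [kron(identity2, p) for p in r_small]

pairs = [(i, j) for i in range(2) for j in range(2)]
joint = {(i, j): matmul(q_proj[i], r_proj[j]) for i, j in pairs}

all_commute = all(close(matmul(q_proj[i], r_proj[j]),
                        matmul(r_proj[j], q_proj[i])) for i, j in pairs)
all_idempotent = all(close(matmul(m, m), m) for m in joint.values())

# The four product projectors sum to the 4 x 4 identity.
total = [[sum(joint[p][a][b] for p in pairs) for b in range(4)]
         for a in range(4)]
identity4 = [[1.0 if a == b else 0.0 for b in range(4)] for a in range(4)]
resolves_identity = close(total, identity4)

# Rank of a projector equals its trace; the ranks add up to q * r = 4.
dimension = sum(trace(m) for m in joint.values())
print(all_commute, all_idempotent, resolves_identity, dimension)
```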

Second, we prove that if events $A'', B''$ are compatible, then the sequential
conjunction obeys the commutative and distributive rules. $(A'' \sqcap B'')$ is true if
and only if the final projection is contained in the subspace corresponding to
the projector $M_B \cdot M_A$; $(B'' \sqcap A'')$ is true if and only if the final projection is
contained in the subspace corresponding to the projector $M_A \cdot M_B$; but these
two projectors are identical because they commute, $M_B \cdot M_A = M_A \cdot M_B$.
Furthermore, $M_B \cdot M_A = M_A \cdot M_B = M_{A \cap B}$ is the projector for the subspace
spanned by eigenvectors that are common between the two events, which equals
the intersection of the two subspaces, $A'' \cap B''$. Thus $(A'' \sqcap B'') = (A'' \cap B'')$
for compatible events. $A'' \sqcap (B'' \cup B^\perp)$ is true if and only if $A''$ is true because
$(B'' \cup B^\perp)$ is always true; $(A'' \sqcap B'')$ is true if and only if the final projection
is contained in the intersection $(A'' \cap B'')$; $(A'' \sqcap B^\perp)$ is true if and only if the
final projection is contained in the intersection $(A'' \cap B^\perp)$; finally $(A'' \sqcap B'') \cup
(A'' \sqcap B^\perp)$ is true if and only if the final projection is contained in $(A'' \cap B'')$ or
$(A'' \cap B^\perp)$, and the latter is true if and only if the final projection is contained
in $(A'' \cap B'') \cup (A'' \cap B^\perp) = A''$.

Third, we prove that if $M_A$ and $M_B$ commute, then the quantum probabilities
obey the classic probability (Kolmogorov) rules. Immediately above we
proved that if $M_A$ and $M_B$ commute, then $A'' \sqcap B'' = A'' \cap B''$ and therefore
$q(A'' \sqcap B'') = q(A'' \cap B'') = q(B'' \cap A'')$. From this it also follows that

$$q(A'' \cup B'') = 1 - q(A^\perp \sqcap B^\perp) = 1 - q(A^\perp \cap B^\perp)$$

and

$$
\begin{aligned}
q(A''|B'') &= |M_A \psi_B|^2 \\
&= |M_A M_B \psi|^2 / |M_B \psi|^2 \\
&= q(A'' \cap B'') / q(B'').
\end{aligned}
$$

If the events are commutative, then the law of total probability also holds

$$
\begin{aligned}
q(A'') &= q(A'' \sqcap (B'' \cup B^\perp)) \\
&= q(A'') q(B''|A'') + q(A'') q(B^\perp|A'') \\
&= q(A'' \sqcap B'') + q(A'' \sqcap B^\perp) \\
&= q(A'' \cap B'') + q(A'' \cap B^\perp) \\
&= q(B'' \cap A'') + q(B^\perp \cap A'').
\end{aligned}
$$

9.3 Derivation for interference terms 

Next we derive Equation 6. 

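$$
\begin{aligned}
q(L'') &= |M_L \psi|^2 = |M_L \cdot I \cdot \psi|^2 = |M_L \cdot (M_H + M_{H^\perp}) \cdot \psi|^2 \qquad (13) \\
&= \psi^\dagger (M_H + M_{H^\perp}) \cdot M_L \cdot M_L \cdot (M_H + M_{H^\perp}) \psi \\
&= \psi^\dagger (M_H + M_{H^\perp}) \cdot M_L \cdot (M_H + M_{H^\perp}) \psi \\
&= \psi^\dagger (M_H M_L M_H + M_{H^\perp} M_L M_H + M_H M_L M_{H^\perp} + M_{H^\perp} M_L M_{H^\perp}) \psi \\
&= q(H'' \sqcap L'') + [\psi^\dagger M_{H^\perp} M_L M_H \psi + \psi^\dagger M_H M_L M_{H^\perp} \psi] + q(H^\perp \sqcap L'') \\
&= q(H'' \sqcap L'') + Int_L + q(H^\perp \sqcap L'').
\end{aligned}
$$

Further analysis of the interference proves that

$$
\begin{aligned}
Int_L &= \psi^\dagger M_{H^\perp} M_L M_H \psi + \psi^\dagger M_H M_L M_{H^\perp} \psi \\
&= (\psi^\dagger M_{H^\perp} M_L M_H \psi) + (\psi^\dagger M_{H^\perp} M_L M_H \psi)^\dagger \\
&= (\psi^\dagger M_{H^\perp} M_L M_H \psi) + (\psi^\dagger M_{H^\perp} M_L M_H \psi)^* \\
&= 2 \cdot \mathrm{Re}[\psi^\dagger M_{H^\perp} M_L M_H \psi] \\
&= 2 \cdot \mathrm{Re}[(M_L M_{H^\perp} \psi)^\dagger (M_L M_H \psi)].
\end{aligned}
$$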

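It is useful to consider the simple case in which the vector space is 2 dimensional
and all events are rays. Then $M_j = V_j V_j^\dagger$ and the interference term reduces to
the special case used in [46]:

$$
\begin{aligned}
& 2 \cdot \mathrm{Re}[(M_L M_{H^\perp} \psi)^\dagger (M_L M_H \psi)] \\
&= 2 \cdot \mathrm{Re}[(V_L V_L^\dagger V_{H^\perp} V_{H^\perp}^\dagger \psi)^\dagger (V_L V_L^\dagger V_H V_H^\dagger \psi)] \\
&= 2 \cdot \mathrm{Re}[(V_L (V_L^\dagger V_{H^\perp})(V_{H^\perp}^\dagger \psi))^\dagger (V_L (V_L^\dagger V_H)(V_H^\dagger \psi))] \\
&= 2 \cdot \mathrm{Re}[(V_L^\dagger V_{H^\perp})^* \cdot (V_{H^\perp}^\dagger \psi)^* \cdot (V_L^\dagger V_L) \cdot (V_L^\dagger V_H) \cdot (V_H^\dagger \psi)] \\
&= 2 \cdot \mathrm{Re}[(V_L^\dagger V_{H^\perp})^* \cdot (V_{H^\perp}^\dagger \psi)^* \cdot (1) \cdot (V_L^\dagger V_H) \cdot (V_H^\dagger \psi)] \\
&= 2 \cdot \mathrm{Re}[(V_L^\dagger V_{H^\perp})^* \cdot (V_{H^\perp}^\dagger \psi)^* \cdot (V_L^\dagger V_H) \cdot (V_H^\dagger \psi)].
\end{aligned}
$$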
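
The two-dimensional special case can be verified with a short numerical sketch. The rays and the state vector below are hypothetical real-valued examples; the helper `inner` conjugates its first argument so that the same code would also accept complex vectors.

```python
import math


def inner(v, w):
    """Inner product V-dagger times W (conjugates the first argument)."""
    return sum(complex(a).conjugate() * b for a, b in zip(v, w))


# Hypothetical rays in a 2-D space and a unit-length state vector.
s = 1.0 / math.sqrt(2.0)
v_h = (1.0, 0.0)        # ray for event H
v_hp = (0.0, 1.0)       # ray for the complement of H
v_l = (s, s)            # ray for event L, incompatible with H
psi = (0.6, -0.8)

q_h = abs(inner(v_h, psi)) ** 2
q_l = abs(inner(v_l, psi)) ** 2

# Sequential probabilities: H (or its complement) first, and then L.
q_h_then_l = abs(inner(v_h, psi)) ** 2 * abs(inner(v_l, v_h)) ** 2
q_hp_then_l = abs(inner(v_hp, psi)) ** 2 * abs(inner(v_l, v_hp)) ** 2

# Interference term from the final line of the derivation above.
int_l = 2.0 * (inner(v_l, v_hp).conjugate()
               * inner(v_hp, psi).conjugate()
               * inner(v_l, v_h)
               * inner(v_h, psi)).real

print(q_l, q_h_then_l + int_l + q_hp_then_l)
print(q_l, q_h_then_l, q_h)
```

With these assumed vectors the interference is negative and larger in magnitude than the second sequential term, so the sequential conjunction exceeds the probability of L while remaining below the probability of H, which is a single conjunction error.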
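If the events are compatible, then the interference is zero:

$$
\begin{aligned}
Int_L &= 2 \cdot \mathrm{Re}[(M_L M_{H^\perp} \psi)^\dagger (M_L M_H \psi)] \\
&= 2 \cdot \mathrm{Re}[\psi^\dagger M_{H^\perp} M_L M_H \psi] \\
&= 2 \cdot \mathrm{Re}[\psi^\dagger \cdot (M_{H^\perp} M_H) \cdot M_L \psi] \\
&= 2 \cdot \mathrm{Re}[\psi^\dagger \cdot (0 \cdot M_L \psi)] = 0.
\end{aligned}
$$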

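If $q(A'') > q(B'')$, then a conjunction error can only occur for the lower
probability event, no matter what order is processed:

$$q(A'') > q(B'') \geq q(B'') \cdot q(A''|B'') = q(B'' \sqcap A''),$$

$$q(A'') \geq q(A'') \cdot q(B''|A'') = q(A'' \sqcap B'').$$

If $q(A'') = q(B'')$, then there can be no conjunction error:

$$q(B'') = q(A'') \geq q(A'') \cdot q(B''|A'') = q(A'' \sqcap B''),$$

$$q(A'') = q(B'') \geq q(B'') \cdot q(A''|B'') = q(B'' \sqcap A'').$$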

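Now consider the relation between $Int_L$ and $Int_{L^\perp}$. First we note that if we
query H first, then $q(H'') = q(H'' \sqcap L'') + q(H'' \sqcap L^\perp)$, producing no interference
for H, so

$$
\begin{aligned}
1 &= q(L'') + q(L^\perp) \\
&= [q(H'' \sqcap L'') + q(H^\perp \sqcap L'') + Int_L] + [q(H'' \sqcap L^\perp) + q(H^\perp \sqcap L^\perp) + Int_{L^\perp}] \\
&= q(H'' \sqcap L'') + q(H'' \sqcap L^\perp) + q(H^\perp \sqcap L'') + q(H^\perp \sqcap L^\perp) + (Int_L + Int_{L^\perp}) \\
&= [q(H'' \sqcap L'') + q(H'' \sqcap L^\perp)] + [q(H^\perp \sqcap L'') + q(H^\perp \sqcap L^\perp)] + (Int_L + Int_{L^\perp}) \\
&= q(H'') + q(H^\perp) + (Int_L + Int_{L^\perp}) = 1 + (Int_L + Int_{L^\perp}).
\end{aligned}
$$

These equalities imply that $Int_L + Int_{L^\perp} = 0$. The same argument holds for
$Int_H + Int_{H^\perp} = 0$.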
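
The cancellation of the two interference terms can be confirmed with projector matrices. In this sketch the rays, the angle of L, and the state vector are hypothetical; the function `interference` evaluates 2 Re[ψ† M_{H⊥} M_E M_H ψ] for an event E, using the fact that projectors are Hermitian.

```python
import math


def inner(v, w):
    """Inner product V-dagger times W (conjugates the first argument)."""
    return sum(complex(a).conjugate() * b for a, b in zip(v, w))


def projector(v):
    """Rank-one projector V times V-dagger for a unit-length vector V."""
    return [[a * complex(b).conjugate() for b in v] for a in v]


def complement(m):
    """Projector onto the orthogonal complement, I - M."""
    n = len(m)
    return [[(1.0 if i == j else 0.0) - m[i][j] for j in range(n)]
            for i in range(n)]


def mat_vec(m, v):
    return [sum(m_ik * v_k for m_ik, v_k in zip(row, v)) for row in m]


def interference(m_h, m_event, psi):
    """2 * Re[psi-dagger M_Hperp M_event M_H psi]."""
    left = mat_vec(complement(m_h), psi)            # M_Hperp psi
    right = mat_vec(m_event, mat_vec(m_h, psi))     # M_event M_H psi
    return 2.0 * inner(left, right).real


# Hypothetical rays and state in a 2-D space.
theta = math.pi / 3.0
m_h = projector((1.0, 0.0))
m_l = projector((math.cos(theta), math.sin(theta)))
m_l_perp = complement(m_l)
psi = (0.6, -0.8)

int_l = interference(m_h, m_l, psi)
int_l_perp = interference(m_h, m_l_perp, psi)
print(int_l, int_l_perp, int_l + int_l_perp)
```

The two terms are equal in magnitude and opposite in sign, so negative interference for L must be accompanied by positive interference for its complement.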

For the next property, it is useful to express the interference as follows: 

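$$
\begin{aligned}
Int_L &= 2 \cdot \mathrm{Re}[\psi^\dagger M_{H^\perp} M_L M_H \psi] \\
&= 2 \cdot \mathrm{Re}[\psi^\dagger (I - M_H) M_L M_H \psi] \\
&= 2 \cdot \mathrm{Re}[\psi^\dagger (M_L M_H - M_H M_L M_H) \psi] \\
&= 2 \cdot \mathrm{Re}[\psi^\dagger M_L M_H \psi] - 2 \cdot \psi^\dagger M_H M_L M_H \psi \\
&= 2 \cdot \mathrm{Re}[\psi^\dagger M_L M_H \psi] - 2 \cdot q(H'' \sqcap L'') \\
&= 2 \cdot \{\mathrm{Re}[(M_L \psi)^\dagger \cdot (M_H \psi)] - q(H'' \sqcap L'')\}.
\end{aligned}
$$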

Satisfying both the conjunction and disjunction errors implies the following
inequalities. The conjunction error requires $Int_L < -q(H^\perp \sqcap L'')$ and the
disjunction error requires $Int_{H^\perp} < -q(L'' \sqcap H^\perp)$, and the latter implies $Int_H >
q(L'' \sqcap H^\perp)$. We know that $Int_H = 2 \cdot \{\mathrm{Re}[(M_H \psi)^\dagger (M_L \psi)] - q(L'' \sqcap H'')\}$. This
implies that we must satisfy the following constraints:

$$
\begin{aligned}
Int_H &= 2 \cdot \mathrm{Re}[(M_H \psi)^\dagger (M_L \psi)] - 2 \cdot q(L'' \sqcap H'') \\
&> q(L'' \sqcap H^\perp) \geq -q(H^\perp \sqcap L'') \\
&> 2 \cdot \mathrm{Re}[(M_L \psi)^\dagger (M_H \psi)] - 2 \cdot q(H'' \sqcap L'') = Int_L.
\end{aligned}
$$

Given the fact that $\mathrm{Re}[(M_H \psi)^\dagger (M_L \psi)] = \mathrm{Re}[(M_L \psi)^\dagger (M_H \psi)]$, we see that this
set of constraints requires $q(L'' \sqcap H'') = |M_H M_L \psi|^2 < |M_L M_H \psi|^2 = q(H'' \sqcap L'')$,
which is consistent with the theory when the events are incompatible.